Bug 1741626 - VM masquerade binding not working on RHEL7 worker nodes (OCP 4.2) [NEEDINFO]
Summary: VM masquerade binding not working on RHEL7 worker nodes (OCP 4.2)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Documentation
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.2.0
Assignee: Andrew Burden
QA Contact: Irina Gulina
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-15 15:54 UTC by Jenifer Abrams
Modified: 2020-04-28 11:53 UTC (History)

Fixed In Version: 2.1, 2.2
Doc Type: Release Note
Doc Text:
The `masquerade` VM binding method does not work and is not supported on RHEL 7 worker nodes.
Clone Of:
Environment:
Last Closed: 2020-01-30 16:27:13 UTC
Target Upstream Version:
ncredi: needinfo? (myakove)


Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHEA-2020:0307  None      None    None     2020-01-30 16:27:23 UTC

Description Jenifer Abrams 2019-08-15 15:54:34 UTC
VM binding type of 'masquerade' does not work properly for the pod network on RHEL 7.6 nodes. Note: masquerade binding works fine on CoreOS 4.2 nodes. 

Also note the 'bridge' binding type is currently working ok for the pod network on RHEL 7.6 nodes (although I know there are plans to remove this).

Sebastian Scheinkman has been helping debug this issue on my cluster. 

Version-Release number of selected component (if applicable):
Running OCP 4.2 (4.2.0-0.ci-2019-07-31-123929-kni.0) + CNV 2.0 deployed via HCO, baremetal cluster w/ RHEL 7.6 worker nodes and CoreOS 4.2 master nodes. 

How reproducible:
Every time


Steps to Reproduce:
1. Start two VMs w/ masquerade binding:
          interfaces:
          - name: default
            masquerade: {}
     [...]
      networks:
      - name: default
        pod: {}

2. Get pod IPs of VMs:
$ oc get pod -o wide
NAME                                               READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
virt-launcher-vm-0-c9v66                           1/1     Running   0          19h     10.128.4.33   worker-0   <none>           <none>
virt-launcher-vm-2-x6pjv                           1/1     Running   0          19h     10.128.4.34   worker-0   <none>           <none>

3. Open the console of one VM and try to ping the other VM; the ping fails.

The same steps above work fine if the VM is started on a CoreOS node.
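
For step 2, the pod IPs can be pulled out of the listing with a small filter. This is only a sketch: the function name `launcher_ips` is made up for illustration, and it assumes the column layout shown above (NAME in column 1, IP in column 6).

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or CNV): read `oc get pod -o wide`
# output on stdin and print "<pod-name> <pod-ip>" for each virt-launcher pod.
# Assumes the column order from step 2: NAME is field 1, IP is field 6.
launcher_ips() {
    awk '$1 ~ /^virt-launcher-/ { print $1, $6 }'
}
```

It would be used as `oc get pod -o wide | launcher_ips`; the header line and any non-launcher pods are skipped.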

Comment 1 Sebastian Scheinkman 2019-08-15 16:43:20 UTC
I was debugging this issue; here is some more context.

The CNV images we are using are built on the ubi8 base image (RHEL 8).

This image ships the following iptables version:
iptables v1.8.2 (nf_tables)

This means the iptables binary is a wrapper around nftables.
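
A quick way to tell which variant a given image carries is to look at the parenthesised suffix of the version string. The helper below is a sketch with a made-up name (`iptables_backend`); it treats anything without the `(nf_tables)` marker as legacy, which also covers iptables releases older than 1.8 that print no suffix at all.

```shell
#!/bin/sh
# Hypothetical helper: classify one line of `iptables --version` output.
# Prints "nf_tables" when the binary is the nftables-backed wrapper,
# otherwise "legacy" (including pre-1.8 output with no suffix).
iptables_backend() {
    case $1 in
        *"(nf_tables)"*) echo nf_tables ;;
        *)               echo legacy ;;
    esac
}
```

Inside a container it could be called as `iptables_backend "$(iptables --version)"`.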
The RHEL 7.6 kernel is:
Linux vm-2 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

nftables is supported only on kernels >= 3.13.
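
The version comparison can be written down as a small check. This is a sketch only: `kernel_has_nftables` is a made-up name, it compares just the leading major.minor of the release string, and it ignores anything a distribution may have backported, so it is a heuristic rather than a capability test.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the kernel release (argument, or `uname -r`
# when omitted) is at least 3.13, the upstream release that merged nf_tables.
# Only "major.minor" is compared; vendor backports are not detected.
kernel_has_nftables() {
    release=${1:-$(uname -r)}
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 13 ]; }
}
```

With the release string quoted above, `kernel_has_nftables 3.10.0-957.el7.x86_64` returns non-zero.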

I tried to find an iptables-legacy package but wasn't able to find one.

We have had an issue with another combination.

The virt-launcher pod was fedora30 and the host was CoreOS8.

Running iptables --version from fedora30:
iptables v1.8.2 (legacy)

To fix that issue we introduced nftables rule creation as a fallback when iptables fails.

https://github.com/kubevirt/kubevirt/pull/2430
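
The "try iptables, then fall back to nftables" idea can be pictured roughly as below. This is not the code from the PR; it is a shell sketch under assumptions: the function name `add_masquerade_rule`, the source address passed in by the caller, and the nft table and chain names `nat` and `postrouting` (which must already exist) are all chosen for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: add a NAT masquerade rule for source address $1.
# Tries iptables first; if that fails, tries the equivalent nft rule.
# Prints which tool succeeded, or returns 1 if both failed.
# Assumes an nft table "nat" (family ip) with a chain "postrouting" exists.
add_masquerade_rule() {
    src=$1
    if iptables -t nat -A POSTROUTING -s "$src" -j MASQUERADE 2>/dev/null; then
        echo iptables
    elif nft add rule ip nat postrouting ip saddr "$src" counter masquerade 2>/dev/null; then
        echo nft
    else
        return 1
    fi
}
```

Note that this ordering only helps when the iptables binary fails outright; it does not help when neither backend in the container matches what the host kernel supports, which is the case reported in this bug.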

The question here is whether we have a way to support the legacy iptables binary in the ubi8 image.


from fedora30:
yum provides iptables-legacy
Last metadata expiration check: 0:43:31 ago on Thu 15 Aug 2019 02:21:32 PM UTC.
iptables-1.8.2-1.fc30.x86_64 : Tools for managing Linux kernel packet filtering capabilities
Repo        : @System
Matched from:
Filename    : /usr/sbin/iptables-legacy


from ubi8:

yum provides iptables-legacy
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 0:01:25 ago on Thu Aug 15 15:01:34 2019.
Error: No Matches found

Comment 2 Dan Kenigsberg 2019-08-15 19:50:48 UTC
Better ask Anita if el8 still carries any userland code that speaks to iptables kernel.

Comment 3 Eric Garver 2019-08-19 12:51:35 UTC
(In reply to Dan Kenigsberg from comment #2)
> Better ask Anita if el8 still carries any userland code that speaks to
> iptables kernel.

It does not. RHEL-8 only has iptables-nft.

IIRC, OpenShift's solution is to mount the host's rootfs and call the host's native version of iptables. Is CNV different?

Comment 4 Sebastian Scheinkman 2019-08-19 13:58:35 UTC
Hi Eric,

Thanks for the comment.

Right now we can't do that.

The iptables rules are created in our virt-launcher pod (which represents the running virtual machine), so we can't mount the host's rootfs into this pod because the user has access to it.

This can lead to a security issue.

Comment 5 Phil Sutter 2019-08-19 16:05:36 UTC
Hi Sebastian,

(In reply to Sebastian Scheinkman from comment #4)
> The iptables are create in our virt-launcher pod (represent the running
> virtual machine) so we can't mount the host into rootfs to this pod because
> the user have access to it.
> 
> This can lead to a security issue.

So providing (read-only) access to host's rootfs to a container manipulating the host's firewall configuration may lead to a security issue? Who's auditing that setup?

Cheers, Phil

Comment 6 Eric Garver 2019-08-19 19:42:00 UTC
Can you explicitly state which RHEL versions are used on the host, containers/pods, and virt-launcher pods?

In general if there is a mismatch between container/host then the host's iptables must be used. This is the case in OpenShift.

See here:
 - https://github.com/openshift/sdn/blob/master/images/node/Dockerfile#L22
 - https://github.com/openshift/cluster-network-operator/blob/master/bindata/network/openshift-sdn/sdn.yaml#L126 (not the read-only mount)
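
As a rough illustration of "use the host's iptables", a wrapper can prefer the host binary whenever the host rootfs is mounted into the container. Everything specific here is an assumption: the function name `host_iptables_cmd`, the `HOST_ROOT` variable, the default mount point `/host`, and the binary path `/usr/sbin/iptables`. The sketch only prints the command it would run, so it can be tried without privileges.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: print the iptables command line to execute.
# If an executable iptables exists under the mounted host rootfs
# ($HOST_ROOT, default /host), the command is run via chroot into it;
# otherwise the container's own iptables is used.
host_iptables_cmd() {
    root=${HOST_ROOT:-/host}
    if [ -x "$root/usr/sbin/iptables" ]; then
        echo "chroot $root /usr/sbin/iptables $*"
    else
        echo "iptables $*"
    fi
}
```

Actually executing the chroot variant would need the corresponding privileges in the container, which is part of the concern raised for virt-launcher earlier in this thread.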

Comment 7 Eric Garver 2019-08-19 19:43:18 UTC
(In reply to Eric Garver from comment #6)
> Can you explicitly state what RHEL version are used on the host,
> containers/pods, and virt-launcher pods?
> 
> In general if there is a mismatch between container/host then the host's
> iptables must be used. This is the case in OpenShift.
> 
> See here:
>  - https://github.com/openshift/sdn/blob/master/images/node/Dockerfile#L22
>  -
> https://github.com/openshift/cluster-network-operator/blob/master/bindata/
> network/openshift-sdn/sdn.yaml#L126 (not the read-only mount)

*NOTE the read-only mount

Comment 8 Jenifer Abrams 2019-08-19 20:11:15 UTC
(In reply to Eric Garver from comment #6)
> Can you explicitly state what RHEL version are used on the host,
> containers/pods, and virt-launcher pods?

My combination was RHEL 7.6 worker node: 3.10.0-957.10.1.el7.x86_64
With CNV 2.0, the virt-launcher uses ubi8: https://access.redhat.com/containers/#/registry.access.redhat.com/container-native-virtualization/virt-launcher/images/v2.0.0-39

I believe Sebastian's report about the other mixing issue is for upstream Kubevirt w/ virt-launcher using fc30 on a CoreOS4.2 node. I will let the CNV team speak to how they want to handle the mismatch cases. 

> 
> In general if there is a mismatch between container/host then the host's
> iptables must be used. This is the case in OpenShift.
> 
> See here:
>  - https://github.com/openshift/sdn/blob/master/images/node/Dockerfile#L22
>  -
> https://github.com/openshift/cluster-network-operator/blob/master/bindata/
> network/openshift-sdn/sdn.yaml#L126 (not the read-only mount)

Comment 19 Dan Kenigsberg 2019-11-21 19:39:14 UTC
Dan W is correct about how cnv can fix this deficiency, but we are unlikely to address it soon.

Please document that in the context of cnv-2.x, the `masquerade` binding method is not supported on el7 nodes.

Comment 20 Andrew Burden 2019-11-22 12:19:30 UTC
Thanks Dan. 
Known Issue added to 2.1 Release Notes: "The `masquerade` binding method for virtual machines cannot be used in clusters with RHEL 7 compute nodes."

PR: https://github.com/openshift/openshift-docs/pull/18255

Nelly, there doesn't seem to be a QE contact assigned to this bug. Can you please assign someone for review?

Comment 22 Nelly Credi 2019-11-25 08:08:53 UTC
Please add the Fixed In Version.

Comment 23 Andrew Burden 2019-11-25 12:37:50 UTC
Right, yes, forgot I need to show the 'advanced fields' now for QE contact.

Fixed in 2.1 and 2.2 because this release note will be published for 2.1 and I've made a note to retain it for subsequent versions as it will continue to be relevant.

Comment 24 Irina Gulina 2019-12-13 14:59:39 UTC
If there is an RFE open to fix this in the future, as comment #1741626#c19 by Dan suggests, I would recommend referring to that RFE in the release note instead of the current BZ.

Otherwise the current release note looks good.

Comment 25 Petr Horáček 2019-12-19 18:44:11 UTC
We don't have an RFE opened for this on BZ; it is tracked only internally in our Jira.

Comment 27 errata-xmlrpc 2020-01-30 16:27:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307

Comment 28 Nelly Credi 2020-04-28 11:53:03 UTC
Update: it was decided not to fix this issue at the moment.
I think we should remove it from known issues if we believe this is not relevant for our customers.

