Bug 1741626

Summary: VM masquerade binding not working on RHEL7 worker nodes (OCP 4.2)
Product: Container Native Virtualization (CNV)
Component: Documentation
Version: 2.0
Target Release: 2.2.0
Hardware: x86_64
OS: Linux
Severity: high
Priority: high
Status: CLOSED ERRATA
Reporter: Jenifer Abrams <jhopper>
Assignee: Andrew Burden <aburden>
QA Contact: Irina Gulina <igulina>
CC: aburden, atragler, cnv-qe-bugs, danken, dcbw, egarver, myakove, ncredi, phoracek, psutter, sscheink
Keywords: Reopened
Fixed In Version: 2.1, 2.2
Doc Type: Release Note
Doc Text: The `masquerade` VM binding method is neither working nor supported on RHEL7 worker nodes.
Type: Bug
Last Closed: 2020-01-30 16:27:13 UTC

Description Jenifer Abrams 2019-08-15 15:54:34 UTC
VM binding type of 'masquerade' does not work properly for the pod network on RHEL 7.6 nodes. Note: masquerade binding works fine on CoreOS 4.2 nodes. 

Also note the 'bridge' binding type is currently working ok for the pod network on RHEL 7.6 nodes (although I know there are plans to remove this).

Sebastian Scheinkman has been helping debug this issue on my cluster. 

Version-Release number of selected component (if applicable):
Running OCP 4.2 (4.2.0-0.ci-2019-07-31-123929-kni.0) + CNV 2.0 deployed via HCO, baremetal cluster w/ RHEL 7.6 worker nodes and CoreOS 4.2 master nodes. 

How reproducible:
Every time


Steps to Reproduce:
1. Start two VMs with masquerade binding (a fuller, self-contained example follows these steps):
          interfaces:
          - name: default
            masquerade: {}
     [...]
      networks:
      - name: default
        pod: {}

2. Get pod IPs of VMs:
$ oc get pod -o wide
NAME                                               READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
virt-launcher-vm-0-c9v66                           1/1     Running   0          19h     10.128.4.33   worker-0   <none>           <none>
virt-launcher-vm-2-x6pjv                           1/1     Running   0          19h     10.128.4.34   worker-0   <none>           <none>

3. Open the console of one VM and try to ping the other VM's pod IP; the ping fails.

The same steps above work fine if the VM is started on a CoreOS node.
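
For completeness, a self-contained version of the reproducer (the container disk image, memory request, and disk layout below are illustrative assumptions; the interface/network part matches the fragment above):

$ oc create -f - <<EOF
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: vm-0
spec:
  domain:
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      interfaces:
      - name: default
        masquerade: {}
    resources:
      requests:
        memory: 1024M
  networks:
  - name: default
    pod: {}
  volumes:
  - containerDisk:
      image: kubevirt/cirros-container-disk-demo
    name: containerdisk
EOF

# repeat with "name: vm-2" for the second VM, then:
$ oc get pod -o wide          # note the other VM's pod IP
$ virtctl console vm-0        # log in to the guest, then from inside it:
$ ping 10.128.4.34            # fails when the VMs run on a RHEL 7.6 worker, works on a CoreOS node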

Comment 1 Sebastian Scheinkman 2019-08-15 16:43:20 UTC
I have been debugging this issue; here is some more context.

The CNV images we use are based on the ubi8 image (RHEL 8).

That image ships this iptables version:
iptables v1.8.2 (nf_tables)

This means the iptables binary is a wrapper around the nftables backend. (The masquerade binding programs its NAT rules inside the virt-launcher pod, which is why iptables matters here.) The RHEL 7.6 kernel, however, is:
Linux vm-2 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

and nftables is only supported by kernels >= 3.13.
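
Both sides of the mismatch can be checked from the cluster with generic commands (using the pod/node names from the reproduction above, and assuming the virt-launcher main container is named "compute"):

# which backend the iptables binary in the virt-launcher pod targets
# (prints "(nf_tables)" here, "(legacy)" on a legacy build)
$ oc exec virt-launcher-vm-0-c9v66 -c compute -- iptables --version

# the kernel that backend has to talk to on the RHEL 7.6 worker
$ oc debug node/worker-0 -- chroot /host uname -r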

I tried to find an iptables-legacy package for ubi8 but wasn't able to find one.

We also have an issue with another combination: the virt-launcher pod was based on fedora30 and the host was CoreOS (RHEL 8 based).

Running iptables --version in the fedora30 image gives:
iptables v1.8.2 (legacy)

To fix that issue we introduced nftables rule creation as a fallback when iptables fails:

https://github.com/kubevirt/kubevirt/pull/2430
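
Conceptually the fallback amounts to something like this (a sketch of the idea only, not the rules or code from that PR; the 10.0.2.x address and the pre-existing "ip nat" table/chain are illustrative assumptions):

# try the iptables binary first; if it cannot program this kernel,
# create an equivalent rule through nft directly
iptables -t nat -A POSTROUTING -s 10.0.2.2 -j MASQUERADE \
  || nft add rule ip nat POSTROUTING ip saddr 10.0.2.2 masquerade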

The question here is whether we have a way to support the legacy iptables binary in the ubi8 image.


from fedora30:
yum provides iptables-legacy
Last metadata expiration check: 0:43:31 ago on Thu 15 Aug 2019 02:21:32 PM UTC.
iptables-1.8.2-1.fc30.x86_64 : Tools for managing Linux kernel packet filtering capabilities
Repo        : @System
Matched from:
Filename    : /usr/sbin/iptables-legacy


from ubi8:

yum provides iptables-legacy
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 0:01:25 ago on Thu Aug 15 15:01:34 2019.
Error: No Matches found

Comment 2 Dan Kenigsberg 2019-08-15 19:50:48 UTC
Better ask Anita if el8 still carries any userland code that speaks to iptables kernel.

Comment 3 Eric Garver 2019-08-19 12:51:35 UTC
(In reply to Dan Kenigsberg from comment #2)
> Better ask Anita if el8 still carries any userland code that speaks to
> iptables kernel.

It does not. RHEL-8 only has iptables-nft.

IIRC, OpenShift's solution is to mount the host's rootfs and call the host's native version of iptables. Is CNV different?

Comment 4 Sebastian Scheinkman 2019-08-19 13:58:35 UTC
Hi Eric,

Thanks for the comment.

Right now we can't do that.

The iptables rules are created in our virt-launcher pod (which represents the running virtual machine), so we can't mount the host rootfs into that pod, because the user has access to it.

That could lead to a security issue.

Comment 5 Phil Sutter 2019-08-19 16:05:36 UTC
Hi Sebastian,

(In reply to Sebastian Scheinkman from comment #4)
> The iptables rules are created in our virt-launcher pod (which represents the
> running virtual machine), so we can't mount the host rootfs into that pod,
> because the user has access to it.
> 
> That could lead to a security issue.

So providing (read-only) access to the host's rootfs to a container that manipulates the host's firewall configuration may lead to a security issue? Who's auditing that setup?

Cheers, Phil

Comment 6 Eric Garver 2019-08-19 19:42:00 UTC
Can you explicitly state what RHEL versions are used on the host, containers/pods, and virt-launcher pods?

In general if there is a mismatch between container/host then the host's iptables must be used. This is the case in OpenShift.

See here:
 - https://github.com/openshift/sdn/blob/master/images/node/Dockerfile#L22
 - https://github.com/openshift/cluster-network-operator/blob/master/bindata/network/openshift-sdn/sdn.yaml#L126 (not the read-only mount)
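
The pattern in those two files is roughly: mount the host's root filesystem read-only into the pod and ship a thin iptables wrapper that executes the host's own binary, so userspace always matches the host kernel. A sketch of that kind of wrapper (illustrative, not the exact openshift-sdn script):

#!/bin/sh
# installed as /usr/sbin/iptables inside the image; /host is the
# read-only hostPath mount of the node's root filesystem
exec chroot /host /usr/sbin/iptables "$@"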

Comment 7 Eric Garver 2019-08-19 19:43:18 UTC
(In reply to Eric Garver from comment #6)
> Can you explicitly state what RHEL versions are used on the host,
> containers/pods, and virt-launcher pods?
> 
> In general if there is a mismatch between container/host then the host's
> iptables must be used. This is the case in OpenShift.
> 
> See here:
>  - https://github.com/openshift/sdn/blob/master/images/node/Dockerfile#L22
>  -
> https://github.com/openshift/cluster-network-operator/blob/master/bindata/
> network/openshift-sdn/sdn.yaml#L126 (not the read-only mount)

*NOTE the read-only mount

Comment 8 Jenifer Abrams 2019-08-19 20:11:15 UTC
(In reply to Eric Garver from comment #6)
> Can you explicitly state what RHEL versions are used on the host,
> containers/pods, and virt-launcher pods?

My combination was RHEL 7.6 worker node: 3.10.0-957.10.1.el7.x86_64
With CNV 2.0, the virt-launcher uses ubi8: https://access.redhat.com/containers/#/registry.access.redhat.com/container-native-virtualization/virt-launcher/images/v2.0.0-39

I believe Sebastian's report about the other mismatch is for upstream KubeVirt with virt-launcher using fc30 on a CoreOS 4.2 node. I will let the CNV team speak to how they want to handle the mismatch cases.

> 
> In general if there is a mismatch between container/host then the host's
> iptables must be used. This is the case in OpenShift.
> 
> See here:
>  - https://github.com/openshift/sdn/blob/master/images/node/Dockerfile#L22
>  -
> https://github.com/openshift/cluster-network-operator/blob/master/bindata/
> network/openshift-sdn/sdn.yaml#L126 (not the read-only mount)

Comment 19 Dan Kenigsberg 2019-11-21 19:39:14 UTC
Dan W is correct about how cnv can fix this deficiency, but we are unlikely to address it soon.

Please document that in the context of cnv-2.x, the `masquerade` binding method is not supported on el7 nodes.

Comment 20 Andrew Burden 2019-11-22 12:19:30 UTC
Thanks Dan. 
Known Issue added to 2.1 Release Notes: "The `masquerade` binding method for virtual machines cannot be used in clusters with RHEL 7 compute nodes."

PR: https://github.com/openshift/openshift-docs/pull/18255

Nelly, there doesn't seem to be a QE contact assigned to this bug. Can you please assign someone for review?

Comment 22 Nelly Credi 2019-11-25 08:08:53 UTC
Please add the Fixed In Version.

Comment 23 Andrew Burden 2019-11-25 12:37:50 UTC
Right, yes, I forgot I need to show the 'advanced fields' now to set the QE contact.

Fixed in 2.1 and 2.2 because this release note will be published for 2.1 and I've made a note to retain it for subsequent versions as it will continue to be relevant.

Comment 24 Irina Gulina 2019-12-13 14:59:39 UTC
If there is an RFE open to fix this in the future, as Dan states in comment #19, I would recommend referring to that RFE in the release note instead of the current BZ.

Otherwise the current release note looks good.

Comment 25 Petr Horáček 2019-12-19 18:44:11 UTC
We don't have an RFE open for this in BZ; it is tracked only internally in our Jira.

Comment 27 errata-xmlrpc 2020-01-30 16:27:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307

Comment 28 Nelly Credi 2020-04-28 11:53:03 UTC
Update: it was decided not to fix this issue at the moment.
I think we should remove it from the known issues if we believe it is not relevant for our customers.

Comment 29 Red Hat Bugzilla 2023-09-14 05:41:42 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days