Bug 2203590

Summary: No connectivity between 2 VMs over SR-IOV connection with VLAN tag
Product: Container Native Virtualization (CNV)
Component: Networking
Version: 4.13.0
Target Release: 4.14.0
Status: NEW
Type: Bug
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Yossi Segev <ysegev>
Assignee: Petr Horáček <phoracek>
QA Contact: Yossi Segev <ysegev>
CC: edwardh, omergi

Description Yossi Segev 2023-05-14 10:50:00 UTC
Description of problem:
Ping fails between the secondary interfaces of 2 CNV guest VMs, which are backed by SR-IOV VFs with a VLAN tag.


Version-Release number of selected component (if applicable):
CNV-4.13.0
ose-sriov-cni@sha256:1e71da4022477787ff8d3f2ff53fc7e86c4f6827734815291b136ef219a0c7a7 (sriov-cni-container-v4.13.0-202304211716.p0.g08b4f6a.assembly.stream, https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2475985)
ose-sriov-network-operator@sha256:f9433618ed10282ef39e0d8267fed539be0688d555c38826e3390bfdb48a27ba (sriov-network-operator-container-v4.13.0-202304211716.p0.g8471529.assembly.stream, https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2475986)


How reproducible:
100%


Steps to Reproduce:
1. On a bare-metal cluster with VLAN and SR-IOV support, apply an SriovNetworkNodePolicy like the attached sriov-network-node-policy.yaml.
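A minimal sketch of what such a policy could look like (this is not the attached file; the policy name, node selector, PF name and VF count are assumptions - only the resource name is taken from the NetworkAttachmentDefinition shown in step 3):
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriov-test-policy                    # hypothetical name
  namespace: openshift-sriov-network-operator  # standard operator namespace (assumed)
spec:
  resourceName: sriov_net_with_vlan          # matches the resourceName annotation on the NAD in step 3
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"   # assumed selector
  numVfs: 8                                  # assumed VF count
  nicSelector:
    pfNames:
      - ens1f0                               # hypothetical PF name
  deviceType: vfio-pci                       # VFs bound to vfio-pci so they can be passed through to VMs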

2. Apply an SriovNetwork like the attached sriov-network.yaml.
Make sure to select a VLAN tag which is enabled on your cluster.
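A sketch of such an SriovNetwork, assuming the attached file follows this shape (the name, target namespace, resource name, VLAN and QoS values are taken from the generated NetworkAttachmentDefinition in step 3; the operator namespace is assumed):
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-test-network-vlan
  namespace: openshift-sriov-network-operator   # assumed operator namespace
spec:
  resourceName: sriov_net_with_vlan
  networkNamespace: sriov-test-sriov             # namespace where the NAD is generated
  vlan: 1000
  vlanQoS: 0
  ipam: "{}"                                     # no IPAM; guest IPs are configured inside the VMs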

3. Make sure a matching NetworkAttachmentDefinition was created:
$ oc get net-attach-def -n sriov-test-sriov sriov-test-network-vlan -o yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/sriov_net_with_vlan
  creationTimestamp: "2023-05-14T09:53:45Z"
  generation: 1
  name: sriov-test-network-vlan
  namespace: sriov-test-sriov
  resourceVersion: "19078001"
  uid: d7498ca2-566a-4b2e-ab23-5c312ef7f1ae
spec:
  config: '{ "cniVersion":"0.3.1", "name":"sriov-test-network-vlan","type":"sriov","vlan":1000,"vlanQoS":0,"ipam":{}
    }'

4. Create 2 VMs like the attached VM manifests (vm3.yaml and vm4.yaml), with secondary interfaces that are based on the NetworkAttachmentDefinition and IP addresses on the same subnet.
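The relevant fragment of such a VM spec might look like this (a sketch only, not the attached vm3.yaml/vm4.yaml; the interface and network names are hypothetical):
spec:
  template:
    spec:
      domain:
        devices:
          interfaces:
            - name: default
              masquerade: {}
            - name: sriov-net              # hypothetical name for the secondary interface
              sriov: {}                    # backed by an SR-IOV VF
      networks:
        - name: default
          pod: {}
        - name: sriov-net
          multus:
            networkName: sriov-test-sriov/sriov-test-network-vlan
# The 10.200.3.x/24 addresses are assigned inside the guests (e.g. via cloud-init),
# since the NetworkAttachmentDefinition has no IPAM.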

5. Log in to one of the VMs and try pinging the IP address of the secondary interface of the other VM:
$ virtctl console sriov-vm3-1684058028-2034686
Successfully connected to sriov-vm3-1684058028-2034686 console. The escape sequence is ^]

[fedora@sriov-vm3-1684058028-2034686 ~]$ ping 10.200.3.2
PING 10.200.3.2 (10.200.3.2) 56(84) bytes of data.
From 10.200.3.1 icmp_seq=1 Destination Host Unreachable
From 10.200.3.1 icmp_seq=2 Destination Host Unreachable
From 10.200.3.1 icmp_seq=3 Destination Host Unreachable

--- 10.200.3.2 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4099ms
pipe 4


Actual results:
Ping fails.


Expected results:
Ping should succeed and verify connectivity.


Additional info:
1. The same setup without a VLAN tag (i.e. the secondary interfaces on the VMs are backed by SR-IOV connections without VLAN) works successfully.
2. The same setup without SR-IOV (i.e. the secondary interfaces on the VMs reside on a VLAN-tagged network) works successfully, so we know VLAN is supported on the cluster(s).
3. The same issue was found on 2 different bare-metal clusters.

Comment 6 Yossi Segev 2023-05-15 18:45:52 UTC
Hi Or,

Per your questions and requests:
> Could you please share the VM and virt-launcher state after they were created?
ysegev@ysegev-fedora (bz-2203590) $
ysegev@ysegev-fedora (bz-2203590) $ oc get vm
NAME                           AGE     STATUS    READY
sriov-vm3-1684176024-1295114   2m58s   Running   True
sriov-vm4-1684176043-6111841   2m39s   Running   True
ysegev@ysegev-fedora (bz-2203590) $
ysegev@ysegev-fedora (bz-2203590) $ oc get vmi
NAME                           AGE     PHASE     IP             NODENAME                                READY
sriov-vm3-1684176024-1295114   3m2s    Running   10.129.1.110   master1.bm02-ibm.ibmc.cnv-qe.rhood.us   True
sriov-vm4-1684176043-6111841   2m42s   Running   10.128.1.20    master2.bm02-ibm.ibmc.cnv-qe.rhood.us   True
ysegev@ysegev-fedora (bz-2203590) $
ysegev@ysegev-fedora (bz-2203590) $ oc get pods
NAME                                               READY   STATUS    RESTARTS   AGE
virt-launcher-sriov-vm3-1684176024-1295114-6zmqd   2/2     Running   0          3m5s
virt-launcher-sriov-vm4-1684176043-6111841-nr4g7   2/2     Running   0          2m46s


> Are the VMs attached to VFs from the exact same PF on each node?
Yes

> It sounds like a routing or packet filtering issue, could you verify with the infra team that the VLAN you are using is applicable?
It is, as I used it in another test (the one where the VM's secondary interface is connected to a VLAN-tagged network, which is connected to a standard Linux bridge and not to an SR-IOV VF).
And just to clarify - this test passed when the 2 VMs were scheduled on different nodes.

> 1. Double-checking the SriovNetworkNodePolicy doesn't specify any interface that belongs to ovn-k networks.
If you are referring to dedicated interfaces like "br-ex", then I verified again that the policy does not use any such interface.
I also tried using another PF interface, and the result is still the same.

> 2. Verify connectivity between the nodes through the SR-IOV PF interface that each VM is connected to.
Successful connection between the PFs.

> 3. Run the test so that both VMs get scheduled on the same node using the same PF (traffic passes through the SR-IOV internal switch).
This scenario succeeded.

Comment 9 Edward Haas 2023-07-04 14:00:43 UTC
This is a quote from an offline message sent by @omergi on 2023-06-16:

> For further troubleshooting, I suggest the following:
> 1. Test with pods:
> Spin up two pods with an SR-IOV interface and a VLAN configuration similar to the test, and check connectivity between them.
> 2. Check connectivity between the nodes through VFs directly:
> On each cluster node (source and target), create a VF of netdevice kind (not using the vfio-pci driver), set it with the same VLAN
> that is used in the tests, and check connectivity.

Please update on the results of the troubleshooting.
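For suggestion 2, the node-level check could look roughly like this (a sketch only; the PF name ens1f0, the VF netdev name and the IP addresses are placeholders and depend on the NIC and its driver):
# On each node (as root), create one VF and leave it bound to its default netdevice driver:
echo 1 > /sys/class/net/ens1f0/device/sriov_numvfs
# Tag the VF with the same VLAN used in the tests:
ip link set ens1f0 vf 0 vlan 1000
# Assign a test address on the VF netdev (use a different address on the peer node) and bring it up:
ip addr add 192.168.100.1/24 dev ens1f0v0
ip link set ens1f0v0 up
# From one node, ping the peer's test address:
ping -c 3 192.168.100.2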

Comment 10 Yossi Segev 2023-07-20 15:05:43 UTC
(In reply to Edward Haas from comment #9)
> This is a quote from an offline message sent by @omergi on 2023-06-16:
> 
> > For further troubleshooting, I suggest the following:
> > 1. Test with pods:
> > Spin up two pods with an SR-IOV interface and a VLAN configuration similar to the test, and check connectivity between them.
> > 2. Check connectivity between the nodes through VFs directly:
> > On each cluster node (source and target), create a VF of netdevice kind (not using the vfio-pci driver), set it with the same VLAN
> > that is used in the tests, and check connectivity.

2. This test failed - no connectivity between VFs.

I will continue with the setup for test 1 (pod connectivity), which requires some more setup actions, as I discussed with Or.
In addition, I will verify again that the VLAN tag (1000) is indeed supported (by isolating the SR-IOV setup).

Comment 11 Yossi Segev 2023-07-25 16:02:33 UTC
After debugging, I filed https://issues.redhat.com/browse/CNV-31351 so that DevOps can verify whether there are any infrastructure issues on the clusters.
Thank you, Or, for the cooperation.