Bug 2012559

Summary: mlx5/ice card: ping fails for the NIC partitioning QinQ case
Product: Red Hat Enterprise Linux Fast Datapath
Component: openvswitch2.15
Version: FDP 21.H
Status: NEW
Severity: unspecified
Priority: unspecified
Reporter: liting <tli>
Assignee: Maxime Coquelin <maxime.coquelin>
QA Contact: liting <tli>
CC: cgoncalves, ctrautma, fbaudin, fleitner, jhsiao, maxime.coquelin, ralongi
Flags: tli: needinfo? (maxime.coquelin)
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Type: Bug
Bug Depends On: 2222423
Bug Blocks:

Description liting 2021-10-10 09:15:27 UTC
Description of problem:
mlx5 card: ping fails for the NIC partitioning QinQ case

Version-Release number of selected component (if applicable):
[root@dell-per730-56 ~]# rpm -qa|grep openvswitch
openvswitch2.16-2.16.0-1.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
[root@dell-per730-56 ~]# uname -r
4.18.0-305.17.1.el8_4.x86_64
[root@dell-per740-03 ~]# rpm -qa|grep dpdk
dpdk-tools-20.11-3.el8.x86_64
dpdk-20.11-3.el8.x86_64


How reproducible:


Steps to Reproduce:
dell730-56 is the server and dell740-03 is the client. The mlx5 card on dell730-56 is directly connected to the mlx5 card on dell740-03. Build the same OVS-DPDK topology on both systems; only the IP addresses differ.
For example, in dell730-56:
1. Create two VFs on the ens4f0 port and configure the MAC address and VLAN for VF 1:
ip link set ens4f0 vf 1 mac 52:54:00:11:8f:e9
ip link set ens4f0 vf 1 vlan 4
ip link set ens4f0 vf 1 trust on
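Note: step 1 above only shows the VF configuration. A minimal sketch of the VF creation itself, assuming the standard sriov_numvfs sysfs interface and that ens4f0 is the PF, would be:

echo 2 > /sys/class/net/ens4f0/device/sriov_numvfs
ip link show ens4f0      # vf 0 and vf 1 should now be listed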

2. Build the OVS-DPDK topology:
ovs-vsctl set Open_vSwitch . 'other_config={}'
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1024,1024
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x880000880000
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-br ovsbr1 -- set bridge ovsbr1 datapath_type=netdev
ovs-vsctl add-port ovsbr1 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:af:00.3
ovs-vsctl add-port ovsbr0 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient -- set Interface dpdkvhostuserclient0 options:vhost-server-path=/tmp/dpdkvhostuserclient0
ovs-vsctl add-port ovsbr0 patch-ovs-0 -- set Interface patch-ovs-0 type=patch options:peer=patch-ovs-1
ovs-vsctl add-port ovsbr1 patch-ovs-1 -- set Interface patch-ovs-1 type=patch options:peer=patch-ovs-0
ovs-vsctl set Port dpdkvhostuserclient0 vlan_mode=dot1q-tunnel tag=4
ip link set ovsbr0 up
ip link set ovsbr1 up
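Note: when the ping fails, one hedged way to see how far the packets get and how OVS handles the QinQ tags is to dump the flows while the ping is running (standard OVS commands, using the bridge and port names above):

ovs-vsctl get Interface dpdk0 link_state    # confirm the DPDK port link is up
ovs-appctl dpctl/dump-flows                 # datapath flows, including the vlan matches
ovs-ofctl dump-flows ovsbr0                 # OpenFlow table of the dot1q-tunnel bridge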

Check the OVS-DPDK topology:
[root@dell-per740-03 ~]# ovs-vsctl show
af603f06-443e-4de8-9135-84b709043520
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port patch-ovs-0
            Interface patch-ovs-0
                type: patch
                options: {peer=patch-ovs-1}
        Port dpdkvhostuserclient0
            tag: 4
            Interface dpdkvhostuserclient0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
    Bridge ovsbr1
        datapath_type: netdev
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:af:00.3"}
        Port patch-ovs-1
            Interface patch-ovs-1
                type: patch
                options: {peer=patch-ovs-0}
        Port ovsbr1
            Interface ovsbr1
                type: internal
    ovs_version: "2.16.0"

3. Start the guest, then configure the VLAN tag and IP addresses on its port:
ip link add link ens9 name ens9.3 type vlan id 3 proto 802.1ad
ip link set ens9.3 up
ip addr add 20.0.0.2/24 dev ens9.3
ip addr add 2001:5c0:9168::2/24 dev ens9.3
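Note: the guest on the peer host is configured the same way with the peer addresses; a sketch, assuming its interface is also named ens9:

ip link add link ens9 name ens9.3 type vlan id 3 proto 802.1ad
ip link set ens9.3 up
ip addr add 20.0.0.1/24 dev ens9.3
ip addr add 2001:5c0:9168::1/24 dev ens9.3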

4. Inside the guest, ping the IP address of the guest on the peer host:
ping 20.0.0.1 -c 10

Actual results:
[root@localhost ~]# ping 20.0.0.1 -c 10
PING 20.0.0.1 (20.0.0.1) 56(84) bytes of data.
From 20.0.0.2 icmp_seq=1 Destination Host Unreachable
From 20.0.0.2 icmp_seq=2 Destination Host Unreachable
From 20.0.0.2 icmp_seq=3 Destination Host Unreachable
From 20.0.0.2 icmp_seq=4 Destination Host Unreachable
From 20.0.0.2 icmp_seq=5 Destination Host Unreachable

Expected results:
ping 20.0.0.1 successfully.

Additional info:
Failed job link:
https://beaker.engineering.redhat.com/jobs/5871869

guest xml:
<domain type='kvm' id='1'>
  <name>master3</name>
  <uuid>37425e76-af6a-44a6-aba0-73434afe34c0</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>5242880</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-2' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/master3.qcow2' index='1'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:11:8f:e9'/>
      <source type='unix' path='/tmp/dpdkvhostuserclient0' mode='server'/>
      <target dev='dpdkvhostuserclient0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:bb:63:7b'/>
      <source bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-master3/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'/>
</domain>

Comment 1 Flavio Leitner 2021-10-14 11:40:16 UTC
Has this test worked before or is this the first time testing this scenario?

Comment 3 liting 2021-10-15 03:32:36 UTC
(In reply to Flavio Leitner from comment #1)
> Has this test worked before or is this the first time testing this scenario?

The previously tested scenarios are not exactly the same as this one; this is the first time this scenario has been tested. Previous versions probably did not work for it either. Thanks.

Comment 4 liting 2022-05-23 13:17:43 UTC
It

Comment 5 liting 2022-05-23 13:18:40 UTC
The ice card also has this issue.
https://beaker.engineering.redhat.com/jobs/6642459

Comment 6 liting 2022-07-13 07:22:28 UTC
The issue is still present on mlx5 for fdp22f:
https://beaker.engineering.redhat.com/jobs/6806340

Comment 7 liting 2022-08-15 03:22:40 UTC
The issue is still present on mlx5 for fdp22g:
https://beaker.engineering.redhat.com/jobs/6900431
https://beaker.engineering.redhat.com/jobs/6900390

Comment 9 liting 2023-04-03 07:31:05 UTC
For fdp23.B with the mlx5_core driver:
rhel8.6 still has this issue.
https://beaker.engineering.redhat.com/jobs/7670639
rhel9.2 does not have this issue.
https://beaker.engineering.redhat.com/jobs/7658364

Comment 10 liting 2023-04-03 08:44:56 UTC
For mlx5_core, rhel8.8 does not have this issue.
https://beaker.engineering.redhat.com/jobs/7696369

Comment 11 liting 2023-04-03 13:38:03 UTC
For ice driver, rhel9.2 still has this issue.
https://beaker.engineering.redhat.com/jobs/7696728

Comment 12 liting 2023-04-04 01:12:42 UTC
For the ice driver, rhel8.8 and rhel8.6 also have this issue.
rhel8.8
https://beaker.engineering.redhat.com/jobs/7697701
rhel8.6
https://beaker.engineering.redhat.com/jobs/7697084

Comment 13 liting 2023-05-24 08:12:24 UTC
For mlx5_core:
A run with RHEL-8.6.0-updates-20230104.0 (kernel-4.18.0-372.40.1.el8_6.x86_64) and openvswitch2.17-2.17.0-80.el8fdp has the issue:
https://beaker.engineering.redhat.com/jobs/7881943
A run with RHEL-8.6.0-updates-20230510.20 (kernel-4.18.0-372.56.1.el8_6) and openvswitch2.17-2.17.0-98.el8fdp does not have the issue:
https://beaker.engineering.redhat.com/jobs/7881463
A run with RHEL-8.6.0-updates-20230510.20 (kernel-4.18.0-372.56.1.el8_6) and openvswitch3.1-3.1.0-29.el8fdp does not have the issue:
https://beaker.engineering.redhat.com/jobs/7881319

Comment 14 Maxime Coquelin 2023-05-25 15:39:10 UTC
Hi,

(In reply to liting from comment #13)
> For mlx5_core,
> run with RHEL-8.6.0-updates-20230104.0(kernel-4.18.0-372.40.1.el8_6.x86_64)
> and openvswitch2.17-2.17.0-80.el8fdp has issue
> https://beaker.engineering.redhat.com/jobs/7881943
> run with RHEL-8.6.0-updates-20230510.20(kernel-4.18.0-372.56.1.el8_6) and
> openvswitch2.17-2.17.0-98.el8fdp has no issue.
> https://beaker.engineering.redhat.com/jobs/7881463
> run with RHEL-8.6.0-updates-20230510.20(kernel-4.18.0-372.56.1.el8_6) and
> openvswitch3.1-3.1.0-29.el8fdp has no issue.
> https://beaker.engineering.redhat.com/jobs/7881319

I checked the 3 outputs and also the ones in Comment 12, and all tests passed:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   Start qinq_test test
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [ 02:04:36 ] :: [   PASS   ] :: Command 'create_vf' (Expected 0, got 0)
:: [ 02:04:47 ] :: [   PASS   ] :: Command 'build_qinq_topo 0x880000880000' (Expected 0, got 0)
:: [ 02:05:10 ] :: [   PASS   ] :: Command 'start_guest3' (Expected 0, got 0)
:: [ 02:05:10 ] :: [   PASS   ] :: Command 'config_vf_mac' (Expected 0, got 0)
:: [ 02:05:42 ] :: [   PASS   ] :: Command 'guest3_config_qinq' (Expected 0, got 0)
:: [ 02:06:46 ] :: [   PASS   ] :: Command 'qinq_ping_test' (Expected 0, got 0)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   Duration: 139s
::   Assertions: 6 good, 0 bad
::   RESULT: PASS (Start qinq_test test)

Maybe I'm not checking the right logs though.
If so, could you provide more info on what the issue is?

Comment 15 liting 2023-05-26 08:15:55 UTC
(In reply to Maxime Coquelin from comment #14)
> I checked the 3 outputs and also ones in Comment 12, and all tests are passed:
> [...]
> Maybe I'm not checking the right logs though.
> If so, could you provide more info on what is the issue?

The qinq_ping_test shows PASS only because my automation script does not yet check the ping result (I will fix that later); the detailed log shows that both the IPv4 and IPv6 pings actually failed:
[root@localhost ~]# ping 20.0.0.1 -c 10
PING 20.0.0.1 (20.0.0.1) 56(84) bytes of data.
From 20.0.0.2 icmp_seq=1 Destination Host Unreachable
From 20.0.0.2 icmp_seq=2 Destination Host Unreachable
From 20.0.0.2 icmp_seq=3 Destination Host Unreachable
From 20.0.0.2 icmp_seq=4 Destination Host Unreachable

[root@localhost ~]# ping6 2001:5c0:9168::1 -c 10
PING 2001:5c0:9168::1(2001:5c0:9168::1) 56 data bytes
From 2001:5c0:9168::2 icmp_seq=1 Destination unreachable: Address unreachable
From 2001:5c0:9168::2 icmp_seq=2 Destination unreachable: Address unreachable
From 2001:5c0:9168::2 icmp_seq=3 Destination unreachable: Address unreachable
From 2001:5c0:9168::2 icmp_seq=4 Destination unreachable: Address unreachable

write_log qinq_test FAIL
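A minimal sketch of the missing judgment, assuming a bash test function that runs the pings from inside the guest (the console/ssh plumbing of the harness is omitted); the function name qinq_ping_check is hypothetical, and write_log is the helper already used above:

qinq_ping_check() {
    # ping exits non-zero when no replies are received, so its exit
    # status can drive the pass/fail judgment directly
    if ping -c 10 20.0.0.1 && ping6 -c 10 2001:5c0:9168::1; then
        write_log qinq_test PASS
    else
        write_log qinq_test FAIL
        return 1
    fi
}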

And for fdp23.E, the issue is still present for the ice card.
rhel8.6 openvswitch2.17-2.17.0-98.el8fdp
https://beaker.engineering.redhat.com/jobs/7893126

Comment 16 Maxime Coquelin 2023-05-30 08:29:18 UTC
Could you please provide me access to a setup where the issue is reproduced?

Comment 17 liting 2023-06-07 05:48:44 UTC
(In reply to Maxime Coquelin from comment #16)
> Could you please provide me access to a setup where the issue is reproduced?

I prepared a setup on systems with the mlx5_core driver; you can access them to debug. When my ice card system is available, I will prepare a setup on it as well.

dell-per730-56.rhts.eng.pek2.redhat.com 100g mlx5_core card <--direct connect--> dell-per740-03.rhts.eng.pek2.redhat.com mlx5_core card
The account/password is root/redhat
[root@dell-per730-56 ~]# ovs-vsctl show
cf23d1cf-2851-4be6-9832-f36dd67f00db
    Bridge ovsbr0
        datapath_type: netdev
        Port patch-ovs-0
            Interface patch-ovs-0
                type: patch
                options: {peer=patch-ovs-1}
        Port dpdkvhostuserclient0
            tag: 4
            Interface dpdkvhostuserclient0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
    Bridge ovsbr1
        datapath_type: netdev
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:04:00.3"}
        Port ovsbr1
            Interface ovsbr1
                type: internal
        Port patch-ovs-1
            Interface patch-ovs-1
                type: patch
                options: {peer=patch-ovs-0}
    ovs_version: "2.17.6"

[root@dell-per740-03 ~]# ovs-vsctl show
42998083-9def-4c0b-a690-4eb5ee84d7ac
    Bridge ovsbr1
        datapath_type: netdev
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:af:00.3"}
        Port patch-ovs-1
            Interface patch-ovs-1
                type: patch
                options: {peer=patch-ovs-0}
        Port ovsbr1
            Interface ovsbr1
                type: internal
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port patch-ovs-0
            Interface patch-ovs-0
                type: patch
                options: {peer=patch-ovs-1}
        Port dpdkvhostuserclient0
            tag: 4
            Interface dpdkvhostuserclient0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
    ovs_version: "2.17.6"

You can use virsh console to connect to the guest on dell740-03 and ping the guest IP address on dell730-56; the ping fails.
[root@dell-per740-03 ~]# virsh console master3
Connected to domain 'master3'
Escape character is ^] (Ctrl + ])

[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:bb:63:7b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.138/24 brd 192.168.122.255 scope global dynamic noprefixroute enp2s0
       valid_lft 2741sec preferred_lft 2741sec
    inet6 fe80::27b0:29d:7b41:df57/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:8f:e9 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::1e59:1d1a:9f3c:572b/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: enp4s0.3@enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:11:8f:e9 brd ff:ff:ff:ff:ff:ff
    inet 20.0.0.2/24 scope global enp4s0.3
       valid_lft forever preferred_lft forever
    inet6 2001:5c0:9168::2/24 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:8fe9/64 scope link 
       valid_lft forever preferred_lft forever
[root@localhost ~]# ping 20.0.0.1
PING 20.0.0.1 (20.0.0.1) 56(84) bytes of data.
From 20.0.0.2 icmp_seq=1 Destination Host Unreachable

Comment 20 liting 2023-06-08 03:12:55 UTC
If you want access to the i40e<->i40e test env, I can set it up.

Comment 22 liting 2023-06-09 04:46:16 UTC
Yes, I built a test env for comment #13 on the mlx5_core system. Ping works when the following command is added. You can access the systems (dell730-56/dell740-03) to debug.
ip link set ens4f0 vf 1 vlan 4
The ice card also has this issue; does it support the SW Inner / HW Outer scenario?
Once you confirm whether mlx5 supports the SW Inner / HW Outer scenario, please let me know.

Comment 25 liting 2023-07-26 04:38:37 UTC
Thanks. I will update my test to the following two tests:
1. Full SW QinQ 
2. SW Inner / HW outer QinQ

For the Full SW QinQ test, I just need to remove the following setting, right?
ip link set ens4f0 vf 1 vlan 4
For the SW Inner / HW Outer QinQ test, the VF needs to be set as follows, right? mlx5_core has bug 2222423 to track it; ice can configure it successfully, but i40e cannot configure it and there is no bug filed for that.
ip link set ens4f0 vf 1 vlan 4 proto 802.1ad
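One hedged way to check which variant actually takes effect is to capture inside the guest, using the interface names from comment #17: the 802.1ad inner tag (vlan 3) should be visible on the parent interface, while the outer tag 4 is pushed and stripped outside the guest (by OVS dot1q-tunnel or by the VF hardware) and is not expected to appear in either capture.

tcpdump -i enp4s0 -e -nn      # parent interface: frames should show the 802.1ad vlan 3 tag
tcpdump -i enp4s0.3 -e -nn    # VLAN subinterface: frames after the inner tag is stripped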

For the i40e driver, 802.1ad also cannot be configured on the VF port:
[root@dell-per730-52 ~]# ip link set enp7s0f0 vf 1 vlan 4 proto 802.1ad
RTNETLINK answers: Protocol not supported


For the ice driver, 802.1ad can be configured on the VF port:
[root@dell-per740-57 ~]# ip link set ens1f0 vf 1 vlan 4 proto 802.1ad
[root@dell-per740-57 ~]# ip link show
8: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:a5:d1:0c brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:60:2f:48:0e:8b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether 52:54:00:11:8f:ea brd ff:ff:ff:ff:ff:ff, vlan 4, vlan protocol 802.1ad, spoof checking on, link-state auto, trust off

For the mlx5_core driver, as comment #13 said, older rhel8.6 versions cannot ping successfully, but the latest rhel8.6 versions can.

With the VLAN configured on the VF port, ping succeeds. It should not succeed, but it does on the latest rhel8.9; should we open a bug about it?
Packets captured inside the guest:
[root@localhost ~]# tcpdump -i enp4s0.3 -vven
dropped privs to tcpdump
tcpdump: listening on enp4s0.3, link-type EN10MB (Ethernet), capture size 262144 bytes
23:23:56.653342 52:54:00:11:8f:e9 > 52:54:00:11:8f:ea, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 1195, offset 0, flags [DF], proto ICMP (1), length 84)
    20.0.0.2 > 20.0.0.1: ICMP echo request, id 3, seq 21, length 64
23:23:56.653357 52:54:00:11:8f:ea > 52:54:00:11:8f:e9, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 37599, offset 0, flags [none], proto ICMP (1), length 84)

With the VLAN configuration removed from the VF, ping also succeeds; this is expected.
Packets captured inside the guest:
[root@localhost ~]# tcpdump -i enp4s0.3 -vven
dropped privs to tcpdump
tcpdump: listening on enp4s0.3, link-type EN10MB (Ethernet), capture size 262144 bytes
23:25:34.588688 52:54:00:11:8f:e9 > 52:54:00:11:8f:ea, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 15532, offset 0, flags [DF], proto ICMP (1), length 84)
    20.0.0.2 > 20.0.0.1: ICMP echo request, id 5, seq 8, length 64
23:25:34.588700 52:54:00:11:8f:ea > 52:54:00:11:8f:e9, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 21264, offset 0, flags [none], proto ICMP (1), length 84)

For the i40e driver, with the VLAN configured on the VF port, ping succeeds. It should not succeed, but it does on the latest rhel8.9; should we open a bug about it? Thanks.

thanks,
Li Ting