Bug 1733409

Summary: mlx5_core nic: sriov vf link state auto does not work
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: liting <tli>
Component: openvswitch2.11
Assignee: Hekai Wang <hewang>
Status: CLOSED NOTABUG
QA Contact: ovs-qe
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: FDP 19.B
CC: akaris, ctrautma, jhsiao, ralongi
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-09 09:12:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description liting 2019-07-26 03:15:19 UTC
Description of problem:
mlx5_core nic: sriov vf link state auto does not work

Version-Release number of selected component (if applicable):
[root@dell-per740-03 ~]# rpm -qa|grep openv
openvswitch2.11-2.11.0-9.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-11.el7fdp.noarch

[root@dell-per740-03 ~]# uname -a
Linux dell-per740-03.rhts.eng.pek2.redhat.com 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

[root@dell-per740-03 ~]# rpm -qa|grep dpdk
dpdk-18.11-4.el7_6.x86_64
dpdk-tools-18.11-4.el7_6.x86_64

[root@dell-per740-03 ~]# lspci|grep Mellanox
af:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
af:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

[root@dell-per740-03 ~]# ethtool -i p4p1
driver: mlx5_core
version: 5.0-0
firmware-version: 16.24.1000 (MT_0000000012)
expansion-rom-version: 
bus-info: 0000:af:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes


How reproducible:


Steps to Reproduce:
Two servers, dell-per730-56 and dell-per740-03, with their mlx5_core NICs connected back to back.
1. On both servers, create a VF on the physical port,
   bind the VF to DPDK,
   and add dpdk0 to the OVS bridge.
        /usr/share/dpdk/usertools/dpdk-devbind.py -b vfio-pci 0000:af:00.3
        systemctl restart openvswitch
        ovs-vsctl set Open_vSwitch . other_config={}
        ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
        ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
        ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x880000880000
        systemctl restart openvswitch
        ovs-vsctl add-br ovsbr1 -- set bridge ovsbr1 datapath_type=netdev
        config_vf_mac_tunnel
        ovs-vsctl add-port ovsbr1 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:af:00.3
        ovs-vsctl add-port ovsbr0 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient -- set Interface dpdkvhostuserclient0 options:vhost-server-path=/tmp/dpdkvhostuserclient0

[root@dell-per740-03 ~]# ovs-vsctl show
c289f8e6-0d00-4959-9f86-4ea31927165f
    Bridge "ovsbr0"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:af:00.3"}
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "dpdkvhostuserclient0"
            Interface "dpdkvhostuserclient0"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
    ovs_version: "2.11.0"

2.  Configure the VF MAC to match the guest's eth0 MAC:
ip link set p4p1 vf 1 mac 52:54:00:11:8f:e9
[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:11:8f:e9 brd ff:ff:ff:ff:ff:ff
    inet 20.0.0.2/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:5c0:9168::2/24 scope global 
       valid_lft forever preferred_lft forever


3.  Start a guest; the guest XML is as follows:
[root@dell-per740-03 ~]# virsh dumpxml master3
<domain type='kvm' id='1'>
  <name>master3</name>
  <uuid>37425e76-af6a-44a6-aba0-73434afe34c0</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>5242880</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-2' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/master3.qcow2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:11:8f:e9'/>
      <source type='unix' path='/tmp/dpdkvhostuserclient0' mode='server'/>
      <target dev='dpdkvhostuserclient0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-master3/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c787,c917</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c787,c917</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+1001</label>
    <imagelabel>+107:+1001</imagelabel>
  </seclabel>
</domain>

4. Set the p4p1 link state down:
ip link set p4p1 down

5. Set p4p1 vf 1 link state to enable, then run a ping test inside the guest.
6. Set p4p1 vf 1 link state to disable, then run a ping test inside the guest.
7. Set p4p1 vf 1 link state to auto, then run a ping test inside the guest.
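The report does not show the exact commands behind steps 5-7; with iproute2 they would presumably be the following (a hypothetical reconstruction using the interface name p4p1 and VF index 1 from the steps above, requiring root and the mlx5 hardware):

```shell
# Step 5: force the VF link up regardless of the PF administrative state
ip link set p4p1 vf 1 state enable

# Step 6: force the VF link down
ip link set p4p1 vf 1 state disable

# Step 7: let the VF link follow the physical link state
ip link set p4p1 vf 1 state auto

# Verify the configured per-VF state in the PF's "vf 1" line
ip link show p4p1
```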

Actual results:
Step 5: ping succeeds
[root@localhost ~]# ping 20.0.0.1 -c 1
PING 20.0.0.1 (20.0.0.1) 56(84) bytes of data.
64 bytes from 20.0.0.1: icmp_seq=1 ttl=64 time=0.080 ms

--- 20.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.080/0.080/0.080/0.000 ms

Step 6: ping fails
[root@localhost ~]# ping 20.0.0.1 -c 1
PING 20.0.0.1 (20.0.0.1) 56(84) bytes of data.

--- 20.0.0.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Step 7: ping succeeds
[root@localhost ~]# ping 20.0.0.1 -c 1
PING 20.0.0.1 (20.0.0.1) 56(84) bytes of data.
64 bytes from 20.0.0.1: icmp_seq=1 ttl=64 time=0.232 ms

--- 20.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.232/0.232/0.232/0.000 ms


Expected results:
The ping in step 7 should fail.
When PF p4p1 is down and the VF link state is configured to auto, the ping should fail, not succeed.

Additional info:

Comment 1 Andreas Karis 2019-08-08 18:12:43 UTC
I may be crazy, but shouldn't "auto" do individual link detection rather than depend on the hypervisor's interface state?

Meaning that:


>> 4. configure p4p1 link state down.
>> ip link set p4p1 down

>> 5. configure p4p1 vf 1 link state enable, do ping test inside guest.

This works because on L1 the physical link is up and the VF state is set to enable.

>> 6. configure p4p1 vf 1 link state disable, do ping test inside guest.

This does not work because on L1 the physical link is up and the VF state is set to disable.

>> 7. configure p4p1 vf 1 link state auto, do ping test inside guest.

This does work because on L1 the physical link is up and the VF state is set to auto-detect (and it detects that L1 is up).

Hence, regardless of the p4p1 administrative state, VF state auto should yield VF up if the physical layer is OK.
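
The three-state behavior described above can be summarized as a tiny decision table (an editorial sketch of the semantics as explained in this comment, not mlx5 driver code):

```shell
# Illustrative sketch: the effective VF link depends on the configured
# per-VF state and, for "auto", on the L1 physical link.
vf_effective_link() {
  local vf_state=$1 phys_link=$2
  case "$vf_state" in
    enable)  echo up ;;              # VF forced up; PF admin state ignored
    disable) echo down ;;            # VF forced down
    auto)    echo "$phys_link" ;;    # VF follows the physical (L1) link
  esac
}

vf_effective_link auto up      # prints "up"   (step 7: ping succeeds)
vf_effective_link disable up   # prints "down" (step 6: ping fails)
```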


If you want to test vf link state auto, I think you have to shut down the interface on the switch side and make sure that L1 is down.

- Andreas

Comment 2 liting 2019-08-28 08:43:16 UTC
(In reply to Andreas Karis from comment #1)
> I may be crazy, but shouldn't "auto" do an individual link detection and not
> depend on the hypervisor's interface state?
> 
> Meaning that:
> 
> 
> . configure p4p1 link state down.
> ip link set p4p1 down
> 
> >> 5. configure p4p1 vf 1 link state enable, do ping test inside guest.
> 
> This works because on L1, the physical link is up, and the VF state is set
> to enable
> 
> >> 6. configure p4p1 vf 1 link state disable, do ping test inside guest.
> 
> This does not work because on L1, the physical link is up, and the VF state
> is set to disable
> 
> >> 7. configure p4p1 vf 1 link state auto, do ping test inside guest.
> 
> This does work because on L1, the physical link is up, and the VF state is
> set to auto detect (and it detects that L1 is up)
> 
> Hence regardless of the p4p1 state, vf state auto should yield VF up if the
> physical layer is o.k.
> 
> 
> If you want to test vf link state auto, I think you have to shut down the
> interface on the switch side and make sure that L1 is down.
> 
> - Andreas

OK, I got it. If L1 is down and the VF state is auto, it works correctly. So if you confirm that "auto" should follow the L1 state rather than the administrative state set by "ip link set p4p1 down", I will close the bug.

thanks,
Li Ting