Description of problem: With latest RHEL8, in OVS offload test, testpmd in guest always has TX-errors. testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 324016 RX-missed: 0 RX-bytes: 19839158 RX-errors: 0 RX-nombuf: 0 TX-packets: 324016 TX-errors: 323856 TX-bytes: 19839158 Throughput (since last show) Rx-pps: 10 Rx-bps: 4912 Tx-pps: 10 Tx-bps: 4912 ############################################################################ testpmd> Setup OVS offload: [root@netqe24 ~]# ovs-vsctl show 37964210-e864-488c-bfe4-c2d26b9ba813 Bridge ovs_pvp_br0 Port ovs_pvp_br0 Interface ovs_pvp_br0 type: internal Port eth0 Interface eth0 Port enp4s0f0np0 Interface enp4s0f0np0 ovs_version: "2.13.1" [root@netqe24 ~]# [root@netqe24 ~]# ethtool -k enp4s0f0np0 | grep hw-tc-offload hw-tc-offload: on [root@netqe24 ~]# ethtool -k eth0 | grep hw-tc-offload hw-tc-offload: on [root@netqe24 ~]# ovs-vsctl get Open_vSwitch . other_config {hw-offload="true"} Start guest [root@netqe24 ~]# virsh dumpxml rhel_loopback_tcflower <domain type='kvm' id='2'> <name>rhel_loopback_tcflower</name> <uuid>0d81f860-655d-4ba1-bbf5-546cd2bcff4c</uuid> <memory unit='KiB'>8388608</memory> <currentMemory unit='KiB'>8388608</currentMemory> <vcpu placement='static' cpuset='6,8,10,12'>4</vcpu> <numatune> <memory mode='strict' nodeset='0'/> </numatune> <resource> <partition>/machine</partition> </resource> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>Broadwell</model> <vendor>Intel</vendor> <feature policy='require' name='vme'/> <feature policy='require' name='ss'/> <feature policy='require' name='vmx'/> <feature policy='require' name='f16c'/> <feature policy='require' name='rdrand'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='arat'/> <feature policy='require' 
name='tsc_adjust'/> <feature policy='require' name='umip'/> <feature policy='require' name='arch-capabilities'/> <feature policy='require' name='xsaveopt'/> <feature policy='force' name='pdpe1gb'/> <feature policy='require' name='abm'/> <feature policy='require' name='skip-l1dfl-vmentry'/> <feature policy='require' name='pschange-mc-no'/> <numa> <cell id='0' cpus='0' memory='8388608' unit='KiB'/> </numa> </cpu> <clock offset='utc'> <timer name='rtc' tickpolicy='catchup'/> <timer name='pit' tickpolicy='delay'/> <timer name='hpet' present='no'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <pm> <suspend-to-mem enabled='no'/> <suspend-to-disk enabled='no'/> </pm> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/opt/images/rhel-guest-image-8.0-x86_64-kvm-tc.qcow2' index='1'/> <backingStore/> <target dev='hda' bus='ide'/> <alias name='ide0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <controller type='usb' index='0' model='ich9-ehci1'> <alias name='usb'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x7'/> </controller> <controller type='usb' index='0' model='ich9-uhci1'> <alias name='usb'/> <master startport='0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/> </controller> <controller type='usb' index='0' model='ich9-uhci2'> <alias name='usb'/> <master startport='2'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/> </controller> <controller type='usb' index='0' model='ich9-uhci3'> <alias name='usb'/> <master startport='4'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <controller type='ide' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' 
bus='0x00' slot='0x01' function='0x1'/> </controller> <interface type='network'> <mac address='52:54:00:01:02:03'/> <source network='default' portid='b87aa796-7347-46ba-b08c-d2723b379097' bridge='virbr0'/> <target dev='vnet0'/> <model type='e1000'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </interface> <serial type='pty'> <source path='/dev/pts/2'/> <target type='isa-serial' port='0'> <model name='isa-serial'/> </target> <alias name='serial0'/> </serial> <console type='pty' tty='/dev/pts/2'> <source path='/dev/pts/2'/> <target type='serial' port='0'/> <alias name='serial0'/> </console> <input type='mouse' bus='ps2'> <alias name='input0'/> </input> <input type='keyboard' bus='ps2'> <alias name='input1'/> </input> <hostdev mode='subsystem' type='pci' managed='yes'> <driver name='vfio'/> <source> <address domain='0x0000' bus='0x04' slot='0x00' function='0x2'/> </source> <alias name='hostdev0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </hostdev> <memballoon model='virtio'> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </memballoon> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c235,c969</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c235,c969</imagelabel> </seclabel> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel> </domain> [root@netqe24 ~]# Run testpmd in guest: testpmd -c 3 -n 4 --socket-mem 2048,0 -w 0000:00:04.0 --legacy-mem -- --burst 64 -i --rxq=1 --txq=1 --rxd=4096 --txd=1024 --coremask=2 --auto-start --port-topology=chained testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 324016 RX-missed: 0 RX-bytes: 19839158 RX-errors: 0 RX-nombuf: 0 TX-packets: 324016 TX-errors: 323856 TX-bytes: 19839158 Throughput (since 
last show) Rx-pps: 10 Rx-bps: 4912 Tx-pps: 10 Tx-bps: 4912 ############################################################################ testpmd> show port xstats all ###### NIC extended statistics for port 0 rx_good_packets: 324196 tx_good_packets: 324196 rx_good_bytes: 19849958 tx_good_bytes: 19849958 rx_missed_errors: 0 rx_errors: 0 tx_errors: 324036 rx_mbuf_allocation_errors: 0 rx_q0packets: 324196 rx_q0bytes: 19849958 rx_q0errors: 0 tx_q0packets: 324196 tx_q0bytes: 19849958 rx_wqe_err: 0 rx_port_unicast_packets: 321509 rx_port_unicast_bytes: 20576576 tx_port_unicast_packets: 158 tx_port_unicast_bytes: 9480 rx_port_multicast_packets: 1349 rx_port_multicast_bytes: 126754 tx_port_multicast_packets: 2721 tx_port_multicast_bytes: 220982 rx_port_broadcast_packets: 1340 rx_port_broadcast_bytes: 443540 tx_port_broadcast_packets: 1359 tx_port_broadcast_bytes: 432171 tx_packets_phy: 0 rx_packets_phy: 0 rx_crc_errors_phy: 0 tx_bytes_phy: 0 rx_bytes_phy: 0 rx_in_range_len_errors_phy: 0 rx_symbol_err_phy: 0 rx_discards_phy: 0 tx_discards_phy: 0 tx_errors_phy: 0 rx_out_of_buffer: 0 testpmd> testpmd> show port info all ********************* Infos for port 0 ********************* MAC address: 96:D2:00:0B:8F:71 Device name: 0000:00:04.0 Driver name: net_mlx5 Devargs: Connect to socket: 0 memory allocation on the socket: 0 Link status: up Link speed: 100000 Mbps Link duplex: full-duplex MTU: 1500 Promiscuous mode: enabled Allmulticast mode: disabled Maximum number of MAC addresses: 128 Maximum number of MAC addresses of hash filtering: 0 VLAN offload: strip off, filter off, extend off, qinq strip off Hash key size in bytes: 40 Redirection table size: 1 Supported RSS offload flow types: ipv4 ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv6 ipv6-frag ipv6-tcp ipv6-udp ipv6-other user defined 15 user defined 16 user defined 17 Minimum size of RX buffer: 32 Maximum configurable length of RX packet: 65536 Maximum configurable size of LRO aggregated packet: 65280 Current number of RX 
queues: 1 Max possible RX queues: 1024 Max possible number of RXDs per queue: 65535 Min possible number of RXDs per queue: 0 RXDs number alignment: 1 Current number of TX queues: 1 Max possible TX queues: 1024 Max possible number of TXDs per queue: 65535 Min possible number of TXDs per queue: 0 TXDs number alignment: 1 Max segment number per packet: 40 Max segment number per MTU/TSO: 40 Switch name: 0000:00:04.0 Switch domain Id: 1 Switch Port Id: 65535 testpmd> Version-Release number of selected component (if applicable): In guest: [root@localhost ~]# uname -r 4.18.0-226.el8.x86_64 [root@localhost ~]# In Host [root@netqe24 ~]# [root@netqe24 ~]# uname -r 4.18.0-228.el8.x86_64 [root@netqe24 ~]# rpm -qa | grep openvswitch openvswitch2.13-2.13.0-48.el8fdp.x86_64 openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch [root@netqe24 ~]#
Did anything show up in dmesg? Either guest or hypervisor.
[root@netqe24 ~]# virsh console rhel_loopback_tcflower Connected to domain rhel_loopback_tcflower Escape character is ^] uto-start --port-topology=chained-txq=1 --rxd=4096 --txd=1024 --coremask=2 --au EAL: Detected 4 lcore(s) EAL: Detected 1 NUMA nodes EAL: Multi-process socket /var/run/dpdk/rte/mp_socket EAL: Selected IOVA mode 'PA' EAL: No available hugepages reported in hugepages-2048kB EAL: Probing VFIO support... EAL: PCI device 0000:00:04.0 on NUMA socket -1 EAL: Invalid NUMA socket, default to 0 EAL: probe driver: 15b3:101e net_mlx5 Interactive-mode selected Auto-start selected testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Configuring Port 0 (socket 0) [ 491.164027] device ens4np0 left promiscuous mode Port 0: 96:D2:00:0B:8F:71 Checking link statuses... Done [ 491.250092] device ens4np0 entered promiscuous mode Start automatic packet forwarding io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native Logical Core 1 (socket 0) forwards packets on 1 streams: RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00 io packet forwarding packets/burst=64 nb forwarding cores=1 - nb forwarding ports=1 port 0: RX queue number: 1 Tx queue number: 1 Rx offloads=0x0 Tx offloads=0x0 RX queue: 0 RX desc=4096 - RX free threshold=0 RX threshold registers: pthresh=0 hthresh=0 wthresh=0 RX Offloads=0x0 TX queue: 0 TX desc=1024 - TX free threshold=0 TX threshold registers: pthresh=0 hthresh=0 wthresh=0 TX offloads=0x0 - TX RS bit threshold=0 testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 0 RX-missed: 0 RX-bytes: 0 RX-errors: 0 RX-nombuf: 0 TX-packets: 0 TX-errors: 0 TX-bytes: 0 Throughput (since last show) Rx-pps: 0 Rx-bps: 0 Tx-pps: 0 Tx-bps: 0 ############################################################################ testpmd> show port stats all 
######################## NIC statistics for port 0 ######################## RX-packets: 0 RX-missed: 0 RX-bytes: 0 RX-errors: 0 RX-nombuf: 0 TX-packets: 0 TX-errors: 0 TX-bytes: 0 Throughput (since last show) Rx-pps: 0 Rx-bps: 0 Tx-pps: 0 Tx-bps: 0 ############################################################################ testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 93 RX-missed: 0 RX-bytes: 5580 RX-errors: 0 RX-nombuf: 0 TX-packets: 93 TX-errors: 61 TX-bytes: 5580 Throughput (since last show) Rx-pps: 1 Rx-bps: 688 Tx-pps: 1 Tx-bps: 688 ############################################################################ testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 125 RX-missed: 0 RX-bytes: 7500 RX-errors: 0 RX-nombuf: 0 TX-packets: 125 TX-errors: 93 TX-bytes: 7500 Throughput (since last show) Rx-pps: 9 Rx-bps: 4760 Tx-pps: 9 Tx-bps: 4760 ############################################################################ testpmd> Telling cores to stop... Waiting for lcores to finish... ---------------------- Forward statistics for port 0 ---------------------- RX-packets: 388 RX-dropped: 0 RX-total: 388 TX-packets: 388 TX-dropped: 0 TX-total: 388 ---------------------------------------------------------------------------- +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++ RX-packets: 388 RX-dropped: 0 RX-total: 388 TX-packets: 388 TX-dropped: 356 TX-total: 744 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Done. Stopping port 0... Stopping ports... Done Shutting down port 0... Closing ports... Done
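To follow the counters across several `show port stats all` dumps saved from the console, the numbers can be pulled out with a short script. This is only a minimal sketch: the helper names `parse_port_stats` and `tx_error_delta` and the regular expression are assumptions made for illustration and are not part of testpmd or DPDK. The sample strings are copied from the third and fourth dumps above.

```python
import re

# Hypothetical helper: parses the text printed by testpmd's
# "show port stats all". It does not talk to testpmd itself.
def parse_port_stats(text):
    """Return {counter_name: int} for every 'RX-xxx: N' / 'TX-xxx: N' pair."""
    pairs = re.findall(r"([RT][Xx]-[A-Za-z]+):\s+(\d+)", text)
    return {name: int(value) for name, value in pairs}


def tx_error_delta(before, after):
    """Return (new TX packets, new TX errors) between two parsed snapshots."""
    return (after["TX-packets"] - before["TX-packets"],
            after["TX-errors"] - before["TX-errors"])


# Counter lines copied from two consecutive dumps in this comment.
SNAPSHOT_1 = ("RX-packets: 93 RX-missed: 0 RX-bytes: 5580 RX-errors: 0 "
              "RX-nombuf: 0 TX-packets: 93 TX-errors: 61 TX-bytes: 5580")
SNAPSHOT_2 = ("RX-packets: 125 RX-missed: 0 RX-bytes: 7500 RX-errors: 0 "
              "RX-nombuf: 0 TX-packets: 125 TX-errors: 93 TX-bytes: 7500")

first = parse_port_stats(SNAPSHOT_1)
second = parse_port_stats(SNAPSHOT_2)
new_packets, new_errors = tx_error_delta(first, second)
print(f"new TX packets: {new_packets}, new TX errors: {new_errors}")
```

For these two dumps the script reports 32 new TX packets and 32 new TX errors, i.e. the error counter grew in step with the packet counter during that interval.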
Created attachment 1703031 [details] dmesg in host
Created attachment 1703032 [details] dmesg in guest
(In reply to Marcelo Ricardo Leitner from comment #1) > Did anything show up in dmesg? Either guest or hypervisor. I didn't find anything that looked meaningful to me. Thanks.
More information: Command used to send packets from netqe25: python -c 'from scapy.all import *; sendp(Ether(src="00:de:ad:00:00:02", dst="00:de:ad:00:00:01")/IP(src="192.168.214.1", dst="192.168.214.2",tos=0xff)/ICMP(), iface="enp4s0f0np0", loop=1, inter=0.1)' When using tcpdump in the guest instead of testpmd, the packets can be seen on the guest kernel interface. [root@localhost ~]# tcpdump -nnev -c 5 -i ens4np0 dropped privs to tcpdump tcpdump: listening on ens4np0, link-type EN10MB (Ethernet), capture size 262144 bytes 00:50:24.613110 00:de:ad:00:00:02 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 60: (tos 0xff,CE, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 28) 192.168.214.1 > 192.168.214.2: ICMP echo request, id 0, seq 0, length 8 00:50:24.713489 00:de:ad:00:00:02 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 60: (tos 0xff,CE, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 28) 192.168.214.1 > 192.168.214.2: ICMP echo request, id 0, seq 0, length 8 00:50:24.814103 00:de:ad:00:00:02 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 60: (tos 0xff,CE, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 28) 192.168.214.1 > 192.168.214.2: ICMP echo request, id 0, seq 0, length 8 00:50:24.914546 00:de:ad:00:00:02 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 60: (tos 0xff,CE, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 28) 192.168.214.1 > 192.168.214.2: ICMP echo request, id 0, seq 0, length 8 00:50:25.015103 00:de:ad:00:00:02 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 60: (tos 0xff,CE, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 28) 192.168.214.1 > 192.168.214.2: ICMP echo request, id 0, seq 0, length 8 5 packets captured 5 packets received by filter 0 packets dropped by kernel [root@localhost ~]#
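The `tos 0xff,CE` that tcpdump prints follows from how the IPv4 ToS byte is split: the upper six bits carry the DSCP and the lower two bits carry the ECN field. A small sketch that decodes the value used in the scapy command above; the helper name `split_tos` is made up for this example.

```python
# ECN codepoints as defined for the low two bits of the ToS / Traffic Class byte.
ECN_NAMES = {0: "Not-ECT", 1: "ECT(1)", 2: "ECT(0)", 3: "CE"}


def split_tos(tos):
    """Split an 8-bit ToS value into (DSCP, ECN name). Illustrative helper."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    dscp = tos >> 2   # upper six bits
    ecn = tos & 0x3   # lower two bits
    return dscp, ECN_NAMES[ecn]


print(split_tos(0xFF))  # (63, 'CE')
```

With `tos=0xff` both ECN bits are set, which is why the capture shows the packets marked as CE.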
Hi Qijun, Do you see this issue only with ConnectX-6 Dx? Is everything OK when using ConnectX-5? From the dmesg logs I see that you have FW version 22.27.2008, can you try the latest release (version 22.28.1002) from [1]? There were FW bugs that caused issues only for ConnectX-6 Dx, I wonder if this one was fixed as well. Thanks, Alaa [1] https://www.mellanox.com/support/firmware/connectx6dx
(In reply to Alaa Hleihel (Mellanox) from comment #7) > Hi Qijun, > > Do you see this issue only with ConnectX-6 Dx? Is everything OK when using > ConnectX-5? Yes, only ConnectX-6 has the issue; ConnectX-5 does not. > > From the dmesg logs I see that you have FW version 22.27.2008, can you try > the latest release (version 22.28.1002) from [1]? > There were FW bugs that caused issues only for ConnectX-6 Dx, I wonder if > this one was fixed as well. > > Thanks, > Alaa > > [1] https://www.mellanox.com/support/firmware/connectx6dx In the link I cannot find the specific model (MCX623106AN-CDAT, MT_0000000359). Which one should I choose? Thanks, Qijun
(In reply to qding from comment #8) > (In reply to Alaa Hleihel (Mellanox) from comment #7) > > Hi Qijun, > > > > Do you see this issue only with ConnectX-6 Dx? Is everything OK when using > > ConnectX-5? > > Yes, only ConnectX-6 has the issue; ConnectX-5 does not. > Thanks, then it's probably a FW issue. > > > > From the dmesg logs I see that you have FW version 22.27.2008, can you try > > the latest release (version 22.28.1002) from [1]? > > There were FW bugs that caused issues only for ConnectX-6 Dx, I wonder if > > this one was fixed as well. > > > > Thanks, > > Alaa > > > > [1] https://www.mellanox.com/support/firmware/connectx6dx > > In the link I cannot find the specific model (MCX623106AN-CDAT, > MT_0000000359). Which one should I choose? > Looks like the FW for this card is missing. I've asked the team to upload it; I'll let you know once it's ready.
(In reply to Alaa Hleihel (Mellanox) from comment #9) > > > [1] https://www.mellanox.com/support/firmware/connectx6dx > > > > In the link I cannot find the specific model (MCX623106AN-CDAT, > > MT_0000000359). Which one should I choose? > > > > Looks like the FW for this card is missing. I've asked the team to upload > it, I'll let you know once it's ready. Done. You can download it directly via this link: http://www.mellanox.com/downloads/firmware/fw-ConnectX6Dx-rel-22_28_1002-MCX623106AN-CDA_Ax-UEFI-14.21.16-FlexBoot-3.6.101.bin.zip
(In reply to Alaa Hleihel (Mellanox) from comment #10) > > done, you can download it directly via this link: > http://www.mellanox.com/downloads/firmware/fw-ConnectX6Dx-rel-22_28_1002- > MCX623106AN-CDA_Ax-UEFI-14.21.16-FlexBoot-3.6.101.bin.zip With the new firmware it works. Thanks. testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 974 RX-missed: 0 RX-bytes: 65032 RX-errors: 0 RX-nombuf: 0 TX-packets: 974 TX-errors: 0 TX-bytes: 65032 Throughput (since last show) Rx-pps: 9 Rx-bps: 4744 Tx-pps: 9 Tx-bps: 4744 ############################################################################ testpmd> [root@netqe24 ~]# ethtool -i enp4s0f0np0 driver: mlx5_core version: 5.0-0 firmware-version: 22.28.1002 (MT_0000000359) expansion-rom-version: bus-info: 0000:04:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: no supports-register-dump: no supports-priv-flags: yes [root@netqe24 ~]#
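When several hosts need the same check, the firmware comparison can be scripted. The following is a rough sketch that assumes the `ethtool -i` output format shown above. The names `parse_ethtool_info`, `fw_tuple`, `fw_is_at_least` and the `MIN_FW` constant are invented for this example, and 22.28.1002 is simply the version that worked in this report, not an official minimum.

```python
import re

# Assumption: the firmware version that resolved the issue in this report.
MIN_FW = "22.28.1002"


def parse_ethtool_info(text):
    """Parse the 'key: value' lines of `ethtool -i <iface>` output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            info[key.strip()] = value.strip()
    return info


def fw_tuple(firmware_field):
    """Turn '22.28.1002 (MT_0000000359)' into (22, 28, 1002)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", firmware_field)
    if match is None:
        raise ValueError(f"unrecognised firmware string: {firmware_field!r}")
    return tuple(int(part) for part in match.groups())


def fw_is_at_least(firmware_field, minimum=MIN_FW):
    """True if the reported firmware is the same as or newer than `minimum`."""
    return fw_tuple(firmware_field) >= fw_tuple(minimum)


# Sample copied from the ethtool output in this comment.
SAMPLE = """driver: mlx5_core
version: 5.0-0
firmware-version: 22.28.1002 (MT_0000000359)
expansion-rom-version:
bus-info: 0000:04:00.0
"""

info = parse_ethtool_info(SAMPLE)
print(info["firmware-version"], fw_is_at_least(info["firmware-version"]))
print("22.27.2008", fw_is_at_least("22.27.2008"))
```

Comparing the versions as integer tuples avoids the pitfalls of plain string comparison when a component has a different number of digits.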
Thanks, Qijun! Please make sure that all your ConnectX-6/Dx cards are updated to the latest FW. Regards, Alaa
(In reply to Alaa Hleihel (Mellanox) from comment #12) > Thanks, Qijun! > > Please make sure that all your ConnectX-6/Dx cards are updated to the latest > FW. > > Regards, > Alaa I've told the team to update the firmware. Thank you.