Bug 1770119

Summary: [mlx5_core] testpmd: infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: qding
Component: DPDK
Assignee: Kevin Traynor <ktraynor>
DPDK sub component: sriov
QA Contact: liting <tli>
Status: NEW
Severity: unspecified
Priority: medium
CC: ctrautma, goetz.waschk, jhsiao, knweiss, mleitner, qding, sscheink
Version: FDP 19.G   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Type: Bug
Attachments:
Description                       Flags
XML for creating guest            none
XML for attaching VF1 to guest    none
XML for attaching VF2 to guest    none

Description qding 2019-11-08 08:05:47 UTC
Description of problem:

[root@localhost ~]# testpmd -l 0-8 -n 4 -w 03:00.0 -w 04:00.0 -- --rxq=2 --txq=2 -i
EAL: Detected 9 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15b3:101a net_mlx5
net_mlx5: MPLS over GRE/UDP tunnel offloading disabled due to old OFED/rdma-core version or firmware configuration
EAL: PCI device 0000:04:00.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15b3:101a net_mlx5
net_mlx5: MPLS over GRE/UDP tunnel offloading disabled due to old OFED/rdma-core version or firmware configuration
Interactive-mode selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=211456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
[  674.840044] infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)
[  680.695218] infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)
[  680.697080] infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)
[  686.473923] infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)
[  692.366890] infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)
[  692.368550] infiniband mlx5_0: mlx5_ib_post_send_wait:919:(pid 2105): reg umr failed (5)

Signal 2 received, preparing to exit...
LATENCY_STATS: failed to remove Rx callback for pid=0, qid=0
LATENCY_STATS: failed to remove Rx callback for pid=0, qid=1
LATENCY_STATS: failed to remove Tx callback for pid=0, qid=0
LATENCY_STATS: failed to remove Tx callback for pid=0, qid=1

Shutting down port 0...
Stopping ports...
Done
Closing ports...
Port 0 is now not stopped
Done

Shutting down port 1...
Stopping ports...
Done
Closing ports...
Done

Bye...

[root@localhost ~]# 
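
To quantify how often the failure fires during a run, the kernel messages shown above can be counted from a saved log. The following is a minimal sketch; `count_umr_failures` is a hypothetical helper name, not part of DPDK or the mlx5 driver, and it only matches the message text seen in this report.

```shell
# Count "reg umr failed" kernel messages in a log read from stdin.
# Hypothetical helper for illustration; it matches only the message
# format shown in this report.
count_umr_failures() {
    # grep -c exits non-zero when nothing matches, so mask the status
    # and still print the count (0).
    grep -c 'mlx5_ib_post_send_wait.*reg umr failed' || true
}

# Example (assumes the messages are still in the kernel ring buffer):
#   dmesg | count_umr_failures
```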

Version-Release number of selected component (if applicable):
[root@localhost ~]# rpm -qa | grep dpdk
dpdk-tools-18.11-4.el7_6.x86_64
dpdk-18.11-4.el7_6.x86_64
[root@localhost ~]# 
[root@localhost ~]# uname -r
3.10.0-1062.el7.x86_64
[root@localhost ~]# 

The same DPDK package versions are installed on the host.

How reproducible: 100%


Steps to Reproduce:
1. create guest
   virsh create g1.xml

2. create one VF on each port of the dual-port PF
   echo 1 > /sys/class/net/p6p1/device/sriov_numvfs 
   echo 1 > /sys/class/net/p6p2/device/sriov_numvfs 

3. attach the VFs to the VM
   virsh attach-device g1 vf1.xml
   virsh attach-device g1 vf2.xml

4. run testpmd in guest
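
The VF creation in step 2 could be wrapped in a small helper that checks the sysfs attribute exists before writing to it. This is only a sketch: `set_numvfs` and the `SYSFS_ROOT` variable are assumptions introduced here so the helper can be pointed at a scratch directory, and on a real host the root defaults to /sys.

```shell
# Write a VF count to an interface's sriov_numvfs attribute.
# SYSFS_ROOT is a hypothetical override added for illustration;
# when unset, the real /sys tree is used.
set_numvfs() {
    iface=$1
    count=$2
    attr="${SYSFS_ROOT:-/sys}/class/net/$iface/device/sriov_numvfs"
    if [ ! -e "$attr" ]; then
        echo "no sriov_numvfs attribute for $iface" >&2
        return 1
    fi
    printf '%s\n' "$count" > "$attr"
}

# Example, mirroring step 2 (run as root on the host):
#   set_numvfs p6p1 1
#   set_numvfs p6p2 1
```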

Actual results:


Expected results:


Additional info:

- boot cmdline in host:

[root@dell-per730-04 perf]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-1101.el7.x86_64 root=/dev/mapper/rhel_dell--per730--04-root ro intel_iommu=on iommu=pt default_hugepagesz=1GB hugepagesz=1G hugepages=16 ksdevice=bootif crashkernel=auto rd.lvm.lv=rhel_dell-per730-04/root rd.lvm.lv=rhel_dell-per730-04/swap console=ttyS0,115200n81
[root@dell-per730-04 perf]# 

- boot cmdline in guest:

[root@localhost ~]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-1062.el7.x86_64 root=/dev/mapper/rhel-root ro intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=1 rhgb console=ttyS0,115200 crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS0
[root@localhost ~]# 
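
When comparing the host and guest command lines above, it can help to test for individual parameters instead of reading the full string. A minimal sketch follows; `has_boot_param` is a hypothetical helper, and it only checks whether a parameter is present, not whether it has any bearing on this bug.

```shell
# Succeed if the kernel command line in $1 contains the exact,
# whitespace-delimited parameter $2. Hypothetical helper for illustration.
has_boot_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   has_boot_param "$(cat /proc/cmdline)" intel_iommu=on && echo present
```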


Same issue with two cards:
[root@dell-per730-04 perf]# lspci -m -s 04:00.0
04:00.0 "Ethernet controller" "Mellanox Technologies" "MT28800 Family [ConnectX-5 Ex]" "Mellanox Technologies" "Device 0002"
[root@dell-per730-04 perf]# 
[root@dell-per730-04 perf]# ethtool -i p6p1
driver: mlx5_core
version: 5.0-0
firmware-version: 16.26.1040 (MT_0000000009)
expansion-rom-version: 
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[root@dell-per730-04 perf]# 

[root@dell-per730-05 perf]# lspci -m -s 84:00.0
84:00.0 "Ethernet controller" "Mellanox Technologies" "MT27800 Family [ConnectX-5]" "Mellanox Technologies" "Device 0007"
[root@dell-per730-05 perf]# 
[root@dell-per730-05 perf]# ethtool -i p2p1
driver: mlx5_core
version: 5.0-0
firmware-version: 16.26.1040 (MT_0000000012)
expansion-rom-version: 
bus-info: 0000:84:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[root@dell-per730-05 perf]#
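
Both cards above report the same firmware version. To compare firmware across several hosts without reading the full `ethtool -i` output, the version field can be extracted as sketched below. `fw_version` is a hypothetical helper name, and it assumes the `firmware-version:` line format shown in this report.

```shell
# Print the firmware version number from `ethtool -i` output on stdin.
# Hypothetical helper; assumes a line of the form
#   firmware-version: 16.26.1040 (MT_0000000009)
fw_version() {
    awk -F': ' '$1 == "firmware-version" { split($2, a, " "); print a[1] }'
}

# Example:
#   ethtool -i p6p1 | fw_version
```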

Comment 3 Alaa Hleihel (NVIDIA Mellanox) 2020-01-29 16:08:13 UTC
Hi qding,

Does this still reproduce?
If so, can you please provide a setup with reproduction steps?

Thanks,
Alaa

Comment 4 qding 2020-02-05 07:28:40 UTC
Created attachment 1657761 [details]
XML for creating guest

Comment 5 qding 2020-02-05 07:29:12 UTC
Created attachment 1657762 [details]
XML for attaching VF1 to guest

Comment 6 qding 2020-02-05 07:29:36 UTC
Created attachment 1657766 [details]
XML for attaching VF2 to guest

Comment 7 qding 2020-02-05 07:30:05 UTC
(In reply to Alaa Hleihel (Mellanox) from comment #3)
> Hi qding,
> 
> does this still reproduce?
> if so, can you please provide a setup with reproduction steps?
> 

1. set the boot options on the host as below, and reboot the system if anything changed

[root@dell-per740-10 ~]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-1125.el7.x86_64 root=/dev/mapper/rhel_dell--per740--10-root ro crashkernel=auto intel_iommu=on iommu=pt default_hugepagesz=1GB hugepagesz=1G hugepages=16 spectre_v2=retpoline rd.lvm.lv=rhel_dell-per740-10/root rd.lvm.lv=rhel_dell-per740-10/swap console=ttyS0,115200n81
[root@dell-per740-10 ~]# 


2. install required packages
   wget http://download-node-02.eng.bos.redhat.com/brewroot/packages/qemu-kvm-rhev/2.12.0/33.el7_7.8/x86_64/qemu-img-rhev-2.12.0-33.el7_7.8.x86_64.rpm
   wget http://download-node-02.eng.bos.redhat.com/brewroot/packages/qemu-kvm-rhev/2.12.0/33.el7_7.8/x86_64/qemu-kvm-common-rhev-2.12.0-33.el7_7.8.x86_64.rpm
   wget http://download-node-02.eng.bos.redhat.com/brewroot/packages/qemu-kvm-rhev/2.12.0/33.el7_7.8/x86_64/qemu-kvm-rhev-2.12.0-33.el7_7.8.x86_64.rpm
   yum -y install libvirt
   yum -y install virt-install
   yum -y install libguestfs-tools
   systemctl restart libvirtd

3. create SR-IOV VFs
   echo 1 > /sys/bus/pci/devices/0000:5e:00.0/sriov_numvfs
   echo 1 > /sys/bus/pci/devices/0000:5e:00.1/sriov_numvfs

4. create VM and attach VFs to guest
   virsh create g1.xml 
   virsh attach-device g1 vf1.xml 
   virsh attach-device g1 vf2.xml 

   * please see the attachments for the XML files

5. install required packages in guest
   yum -y install http://download-node-02.eng.bos.redhat.com/brewroot/packages/dpdk/18.11.2/1.el7/x86_64/dpdk-18.11.2-1.el7.x86_64.rpm
   yum -y install http://download-node-02.eng.bos.redhat.com/brewroot/packages/dpdk/18.11.2/1.el7/x86_64/dpdk-tools-18.11.2-1.el7.x86_64.rpm
   yum -y install libibverbs

6. set the boot options in the guest as below, and reboot the guest if anything changed

[root@localhost ~]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-1118.el7.x86_64 root=/dev/mapper/rhel-root ro rhgb intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=2 console=ttyS0,115200 crashkernel=auto spectre_v2=retpoline rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS0
[root@localhost ~]#

7. run testpmd
   
   testpmd -l 0,1,2 -n 4 --socket-mem 1024 --legacy-mem -w 0000:03:00.0 -w 0000:04:00.0 -- --burst=64 -i --forward-mode=mac --auto-start --nb-cores=2 --rxq=1 --txq=1 --rxd=2048 --txd=2048 --disable-rss
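
Before launching testpmd in step 7, one might confirm that both VFs are visible inside the guest. The sketch below counts PCI functions with a given vendor:device ID in `lspci -n` output. `count_pci_devices` is a hypothetical helper, and the ID 15b3:101a is taken from the EAL probe lines in the original description, so it is an assumption that the VFs in this setup report the same ID.

```shell
# Count PCI functions whose vendor:device ID equals $1, reading
# `lspci -n` output (slot, class, vendor:device) from stdin.
# Hypothetical helper for illustration.
count_pci_devices() {
    awk -v id="$1" '$3 == id { n++ } END { print n + 0 }'
}

# Example (the ID is an assumption; adjust it to the VF actually attached):
#   lspci -n | count_pci_devices 15b3:101a
```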



Thanks
Qijun