Bug 1789352
| Summary: | Unable to run a dpdk workload without privileged=true | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Amnon Ilan <ailan> | |
| Component: | dpdk | Assignee: | David Marchand <dmarchan> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Jean-Tsung Hsiao <jhsiao> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 8.1 | CC: | ailan, atragler, augol, bbennett, dmarchan, eparis, fsimonce, jhsiao, kzhang, ovs-qe, sscheink, toneata, tredaelli, zshi, zzhao | |
| Target Milestone: | rc | Keywords: | ZStream | |
| Target Release: | --- | Flags: | pm-rhel: mirror+ | |
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 1785933 | |||
| : | 1791410 1791411 (view as bug list) | Environment: | ||
| Last Closed: | 2020-12-15 11:04:34 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1785933 | |||
| Bug Blocks: | 1771572, 1791410, 1791411 | |||
|
Comment 5
David Marchand
2020-01-10 15:28:00 UTC
Please note too that I did not have an 8.2 system at the time, but you must test on 8.2.

*** Bug 1785933 has been marked as a duplicate of this bug. ***

Hi David, Got segfault! Please advise. Thanks! Jean

    Installed: dpdk-19.11-1.el8.x86_64
    Complete!
    [root@netqe7 ~]# setcap cap_net_admin,cap_net_raw,cap_ipc_lock+ep $(which testpmd)
    [root@netqe7 ~]# sudo -u testuser testpmd -w 0000:03:00.0 -v -- -i
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: RTE Version: 'DPDK 19.11.0'
    EAL: Multi-process socket /tmp/dpdk/rte/mp_socket
    EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
    EAL: Selected IOVA mode 'VA'
    EAL: Probing VFIO support...
    EAL: PCI device 0000:03:00.0 on NUMA socket 0
    EAL: probe driver: 15b3:1007 net_mlx4
    Interactive-mode selected
    testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456, size=2176, socket=0
    testpmd: preferred mempool ops selected: ring_mp_mc
    testpmd: create a new mbuf pool <mbuf_pool_socket_1>: n=267456, size=2176, socket=1
    testpmd: preferred mempool ops selected: ring_mp_mc
    Port 0 is now not stopped
    Configuring Port 1 (socket 0)
    Segmentation fault
    [root@netqe7 ~]# rpm -q dpdk
    dpdk-19.11-1.el8.x86_64
    [root@netqe7 ~]# uname -r
    4.18.0-167.el8.x86_64
    [root@netqe7 ~]#
    [root@netqe7 ~]# cat /etc/modprobe.d/mlx4.conf
    # This file is intended for users to select the various module options
    # they need for the mlx4 driver. On upgrade of the rdma package,
    # any user made changes to this file are preserved. Any changes made
    # to the libmlx4.conf file in this directory are overwritten on
    # pacakge upgrade.
    #
    # Some sample options and what they would do
    # Enable debugging output, device managed flow control, and disable SRIOV
    #options mlx4_core debug_level=1 log_num_mgm_entry_size=-1 probe_vf=0 num_vfs=0
    #
    # Enable debugging output and create SRIOV devices, but don't attach any of
    # the child devices to the host, only the parent device
    #options mlx4_core debug_level=1 probe_vf=0 num_vfs=7
    #
    # Enable debugging output, SRIOV, and attach one of the SRIOV child devices
    # in addition to the parent device to the host
    #options mlx4_core debug_level=1 probe_vf=1 num_vfs=7
    #
    # Enable per priority flow control for send and receive, setting both priority
    # 1 and 2 as no drop priorities
    #options mlx4_en pfctx=3 pfcrx=3
    options mlx4_core debug_level=1 log_num_mgm_entry_size=-1
    options mlx4_core log_num_mgm_entry_size=-1
    [root@netqe7 ~]#

Can you retrieve the coredump or give me access to this system? Thanks.

(In reply to David Marchand from comment #16)
> Can you retrieve the coredump or give me access to this system?
> Thanks.

Sorry! Gave you wrong test bed!

    <jhsiao> HI
    <jhsiao> Sorry! Wrong testbed
    <jhsiao> Should be
    <jhsiao> netqe7.knqe.lab.eng.bos.redhat.com
    <jhsiao> 10.19.15.0/24 dev eno3 proto kernel scope link src 10.19.15.17 metric 100
    <jhsiao> same passwd

This is a different issue, most likely an mlx setup issue. Starting as root triggers the segfault on this system:

    [root@netqe7 ~]# testpmd -w 0000:03:00.0 -v -- -i
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: RTE Version: 'DPDK 19.11.0'
    EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
    EAL: Selected IOVA mode 'VA'
    EAL: Probing VFIO support...
    EAL: PCI device 0000:03:00.0 on NUMA socket 0
    EAL: probe driver: 15b3:1007 net_mlx4
    Interactive-mode selected
    testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456, size=2176, socket=0
    testpmd: preferred mempool ops selected: ring_mp_mc
    testpmd: create a new mbuf pool <mbuf_pool_socket_1>: n=267456, size=2176, socket=1
    testpmd: preferred mempool ops selected: ring_mp_mc
    Port 0 is now not stopped
    Port 1 is now not stopped
    Please stop the ports first
    Done
    Segmentation fault (core dumped)
    [root@netqe7 ~]# gdb $(which testpmd) # -w 0000:03:00.0 -v -- -i
    GNU gdb (GDB) Red Hat Enterprise Linux 8.2-8.el8
    Copyright (C) 2018 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /usr/bin/testpmd...Reading symbols from /usr/lib/debug/usr/bin/testpmd-19.11-1.el8.x86_64.debug...done.
    done.
    (gdb) run -w 0000:03:00.0 -v --log-level *:debug -- -i
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /usr/bin/testpmd -w 0000:03:00.0 -v --log-level *:debug -- -i
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    EAL: Detected lcore 0 as core 2 on socket 0
    EAL: Detected lcore 1 as core 0 on socket 1
    EAL: Detected lcore 2 as core 3 on socket 0
    EAL: Detected lcore 3 as core 1 on socket 1
    EAL: Detected lcore 4 as core 4 on socket 0
    EAL: Detected lcore 5 as core 4 on socket 1
    EAL: Detected lcore 6 as core 5 on socket 0
    EAL: Detected lcore 7 as core 5 on socket 1
    EAL: Detected lcore 8 as core 2 on socket 0
    EAL: Detected lcore 9 as core 0 on socket 1
    EAL: Detected lcore 10 as core 3 on socket 0
    EAL: Detected lcore 11 as core 1 on socket 1
    EAL: Detected lcore 12 as core 4 on socket 0
    EAL: Detected lcore 13 as core 4 on socket 1
    EAL: Detected lcore 14 as core 5 on socket 0
    EAL: Detected lcore 15 as core 5 on socket 1
    EAL: Support maximum 128 logical core(s) by configuration.
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: RTE Version: 'DPDK 19.11.0'
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_bnxt.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_e1000.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_enic.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_failsafe.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_i40e.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_ixgbe.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_mlx4.so.20.0
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_mlx5.so.20.0
    warning: Loadable section ".note.gnu.property" outside of ELF segments
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_netvsc.so.20.0
    EAL: Registered [vmbus] bus.
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_nfp.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_qede.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_ring.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_tap.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_vdev_netvsc.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_vhost.so.20.0
    EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_virtio.so.20.0
    EAL: Ask a virtual area of 0x5000 bytes
    EAL: Virtual area found at 0x100000000 (size = 0x5000)
    [New Thread 0x7ffff01c0700 (LWP 15655)]
    EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
    [New Thread 0x7fffef9bf700 (LWP 15656)]
    EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or directory)
    EAL: VFIO PCI modules not loaded
    EAL: Bus pci wants IOVA as 'DC'
    EAL: Buses did not request a specific IOVA mode.
    EAL: IOMMU is available, selecting IOVA as VA mode.
    EAL: Selected IOVA mode 'VA'
    EAL: Probing VFIO support...
    EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
    EAL: VFIO modules not loaded, skipping VFIO support...
    EAL: Ask a virtual area of 0x2e000 bytes
    EAL: Virtual area found at 0x100005000 (size = 0x2e000)
    EAL: Setting up physically contiguous memory...
    EAL: Setting maximum number of open files to 4096
    EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
    EAL: Detected memory type: socket_id:1 hugepage_sz:1073741824
    EAL: Creating 4 segment lists: n_segs:32 socket_id:0 hugepage_sz:1073741824
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x100033000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 0
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x140000000 (size = 0x800000000)
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x940000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 0
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x980000000 (size = 0x800000000)
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x1180000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 0
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x11c0000000 (size = 0x800000000)
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x19c0000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 0
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x1a00000000 (size = 0x800000000)
    EAL: Creating 4 segment lists: n_segs:32 socket_id:1 hugepage_sz:1073741824
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x2200000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 1
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x2240000000 (size = 0x800000000)
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x2a40000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 1
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x2a80000000 (size = 0x800000000)
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x3280000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 1
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x32c0000000 (size = 0x800000000)
    EAL: Ask a virtual area of 0x1000 bytes
    EAL: Virtual area found at 0x3ac0000000 (size = 0x1000)
    EAL: Memseg list allocated: 0x100000kB at socket 1
    EAL: Ask a virtual area of 0x800000000 bytes
    EAL: Virtual area found at 0x3b00000000 (size = 0x800000000)
    EAL: TSC frequency is ~3500000 KHz
    EAL: Master lcore 0 is ready (tid=7ffff7fd7900;cpuset=[0])
    [New Thread 0x7fffef1be700 (LWP 15657)]
    EAL: lcore 1 is ready (tid=7fffef1be700;cpuset=[1])
    [New Thread 0x7fffee9bd700 (LWP 15658)]
    EAL: lcore 2 is ready (tid=7fffee9bd700;cpuset=[2])
    [New Thread 0x7fffee1bc700 (LWP 15659)]
    EAL: lcore 3 is ready (tid=7fffee1bc700;cpuset=[3])
    [New Thread 0x7fffed9bb700 (LWP 15660)]
    EAL: lcore 4 is ready (tid=7fffed9bb700;cpuset=[4])
    [New Thread 0x7fffed1ba700 (LWP 15661)]
    EAL: lcore 5 is ready (tid=7fffed1ba700;cpuset=[5])
    [New Thread 0x7fffec9b9700 (LWP 15662)]
    EAL: lcore 6 is ready (tid=7fffec9b9700;cpuset=[6])
    [New Thread 0x7fffd7fff700 (LWP 15663)]
    [New Thread 0x7fffd77fe700 (LWP 15664)]
    EAL: lcore 7 is ready (tid=7fffd7fff700;cpuset=[7])
    [New Thread 0x7fffd6ffd700 (LWP 15665)]
    [New Thread 0x7fffd67fc700 (LWP 15666)]
    EAL: lcore 9 is ready (tid=7fffd6ffd700;cpuset=[9])
    EAL: lcore 10 is ready (tid=7fffd67fc700;cpuset=[10])
    [New Thread 0x7fffd5ffb700 (LWP 15667)]
    EAL: lcore 11 is ready (tid=7fffd5ffb700;cpuset=[11])
    EAL: lcore 8 is ready (tid=7fffd77fe700;cpuset=[8])
    [New Thread 0x7fffd57fa700 (LWP 15668)]
    [New Thread 0x7fffd4ff9700 (LWP 15669)]
    EAL: lcore 13 is ready (tid=7fffd4ff9700;cpuset=[13])
    [New Thread 0x7fffbffff700 (LWP 15670)]
    [New Thread 0x7fffbf7fe700 (LWP 15671)]
    EAL: lcore 15 is ready (tid=7fffbf7fe700;cpuset=[15])
    EAL: lcore 14 is ready (tid=7fffbffff700;cpuset=[14])
    EAL: lcore 12 is ready (tid=7fffd57fa700;cpuset=[12])
    EAL: Trying to obtain current memory policy.
    EAL: Setting policy MPOL_PREFERRED for socket 0
    EAL: Restoring previous memory policy: 0
    EAL: request: mp_malloc_sync
    EAL: Heap on socket 0 was expanded by 1024MB
    EAL: PCI device 0000:03:00.0 on NUMA socket 0
    EAL: probe driver: 15b3:1007 net_mlx4
    EAL: Mem event callback 'MLX4_MEM_EVENT_CB:(nil)' registered
    EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
    Interactive-mode selected
    testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456, size=2176, socket=0
    testpmd: preferred mempool ops selected: ring_mp_mc
    testpmd: create a new mbuf pool <mbuf_pool_socket_1>: n=267456, size=2176, socket=1
    testpmd: preferred mempool ops selected: ring_mp_mc
    EAL: Trying to obtain current memory policy.
    EAL: Setting policy MPOL_PREFERRED for socket 1
    EAL: Restoring previous memory policy: 0
    EAL: Calling mem event callback 'MLX4_MEM_EVENT_CB:(nil)'
    EAL: request: mp_malloc_sync
    EAL: Heap on socket 1 was expanded by 1024MB
    Port 0 is now not stopped
    Port 1 is now not stopped
    Please stop the ports first
    Done
    Thread 1 "testpmd" received signal SIGSEGV, Segmentation fault.
    0x00007ffff27513cb in mlx4_flow_internal (priv=0x17ffe3180, error=0x7fffffffdd40) at /usr/src/debug/dpdk-19.11-1.el8.x86_64/drivers/net/mlx4/mlx4_flow.c:1502
    1502        while (flow && flow->internal) {
    (gdb) l
    1497                flow->select = 1;
    1498        }
    1499    error:
    1500        /* Clear selection and clean up stale internal flow rules. */
    1501        flow = LIST_FIRST(&priv->flows);
    1502        while (flow && flow->internal) {
    1503            struct rte_flow *next = LIST_NEXT(flow, next);
    1504
    1505            if (!flow->select)
    1506                claim_zero(mlx4_flow_destroy(ETH_DEV(priv), flow,
    (gdb) bt full
    #0 0x00007ffff27513cb in mlx4_flow_internal (priv=0x17ffe3180, error=0x7fffffffdd40) at /usr/src/debug/dpdk-19.11-1.el8.x86_64/drivers/net/mlx4/mlx4_flow.c:1502
            attr = {group = 0, priority = 4095, ingress = 1, egress = 0, transfer = 0, reserved = 0}
            eth_spec = {dst = {addr_bytes = "\000\000i\t\201", <incomplete sequence \363>}, src = {addr_bytes = "\377\177\000\000\300\006"}, type = 62389}
            eth_mask = {dst = {addr_bytes = "\377\377\377\377\377\377"}, src = {addr_bytes = "\000\000\000\000\000"}, type = 0}
            eth_allmulti = {dst = {addr_bytes = "\001\000\000\000\000"}, src = {addr_bytes = "\000\000\000\000\000"}, type = 0}
            vlan_spec = {tci = 32767, inner_type = 0}
            vlan_mask = {tci = 65295, inner_type = 0}
            pattern = {{type = 4294967295, spec = 0x0, last = 0x0, mask = 0x0}, {type = RTE_FLOW_ITEM_TYPE_ETH, spec = 0x7fffffffdbf6, last = 0x0, mask = 0x7fffffffdc04}, {type = RTE_FLOW_ITEM_TYPE_END, spec = 0x0, last = 0x0, mask = 0x0}, {type = RTE_FLOW_ITEM_TYPE_END, spec = 0x0, last = 0x0, mask = 0x0}}
            queues = <optimized out>
            queue = 0x7fffffffdb30
            action_rss = {func = RTE_ETH_HASH_FUNCTION_DEFAULT, level = 0, types = 0, key_len = 40, queue_num = 0, key = 0x7ffff29630a0 <mlx4_rss_hash_key_default> ",Ɓ\321[\333\364\367\374\242\203\031\333\032>\224k\236\070\331,\234\003ѭ\231D\247\331V=Y\006<%\363\374\037\334*", queue = 0x7fffffffdb30}
            actions = {{type = RTE_FLOW_ACTION_TYPE_RSS, conf = 0x7fffffffdbc0}, {type = RTE_FLOW_ACTION_TYPE_END, conf = 0x0}}
            rule_mac = 0x7fffffffdbf6
            rule_vlan = 0x0
            vlan = <optimized out>
            flow = 0x7f5be01a9500
            i = <optimized out>
            err = 0
    #1 0x00007ffff27516a5 in mlx4_flow_sync (priv=0x17ffe3180, error=error@entry=0x7fffffffdd40) at /usr/src/debug/dpdk-19.11-1.el8.x86_64/drivers/net/mlx4/mlx4_flow.c:1549
            flow = <optimized out>
            ret = <optimized out>
    #2 0x00007ffff274e0e3 in mlx4_rxmode_toggle (toggle=<optimized out>, dev=<optimized out>) at /usr/src/debug/dpdk-19.11-1.el8.x86_64/drivers/net/mlx4/mlx4_ethdev.c:371
            priv = <optimized out>
            mode = 0x7ffff275d724 "promiscuous"
            error = {type = 1434613920, cause = 0x0, message = 0x1 <error: Cannot access memory at address 0x1>}
            ret = <optimized out>
    #3 0x00007ffff54a749b in rte_eth_promiscuous_enable (port_id=port_id@entry=0) at /usr/src/debug/dpdk-19.11-1.el8.x86_64/lib/librte_ethdev/rte_ethdev.c:2247
            dev = 0x7ffff56cb100 <rte_eth_devices>
            diag = 0
    #4 0x00005555555a74fa in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/dpdk-19.11-1.el8.x86_64/app/test-pmd/testpmd.c:3492
            diag = <optimized out>
            port_id = 0
            count = <optimized out>
            ret = <optimized out>

I suppose this has to do with hardware/kernel module configuration. I rebooted the system and the problem is gone. testpmd starts fine as root and non root.

    [root@netqe7 ~]# chown testuser /dev/hugepages/rtemap*
    [root@netqe7 ~]# chown testuser /dev/hugepages
    [root@netqe7 ~]# sudo -u testuser testpmd -w 0000:03:00.0 -v -- -i
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: RTE Version: 'DPDK 19.11.0'
    EAL: Multi-process socket /tmp/dpdk/rte/mp_socket
    EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
    EAL: Selected IOVA mode 'VA'
    EAL: Probing VFIO support...
    EAL: PCI device 0000:03:00.0 on NUMA socket 0
    EAL: probe driver: 15b3:1007 net_mlx4
    Interactive-mode selected
    testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456, size=2176, socket=0
    testpmd: preferred mempool ops selected: ring_mp_mc
    testpmd: create a new mbuf pool <mbuf_pool_socket_1>: n=267456, size=2176, socket=1
    testpmd: preferred mempool ops selected: ring_mp_mc
    Configuring Port 0 (socket 0)
    net_mlx4: 0x7fcfeb50b100: cannot attach flow rules (code 95, "Operation not supported"), flow error type 2, cause 0x1598cea40, message: flow rule rejected by device
    Fail to start port 0
    Configuring Port 1 (socket 0)
    net_mlx4: 0x7fcfeb50f1c0: cannot attach flow rules (code 95, "Operation not supported"), flow error type 2, cause 0x1598cc080, message: flow rule rejected by device
    Fail to start port 1
    Please stop the ports first
    Done

I downgraded to dpdk 18.11.2 and the same error happens. Can you double check this setup with 18.11 on rhel 8.2 please?

Hi David, By 18.11 you mean 18.11.5 or other 18.11.X where X is NOT = 2 or 5 ? Thanks! Jean

Hi David, Don't know what's wrong with netqe7. I moved to netqe8 --- mirror of netqe7. It's now working --- dpdk-19.11-1 under RHEL-8.2.0. Please check the log below. Thanks! Jean

    [root@netqe8 ~]# sudo -u testuser testpmd -w 0000:03:00.0 -v -- -i
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: RTE Version: 'DPDK 19.11.0'
    EAL: Multi-process socket /tmp/dpdk/rte/mp_socket
    EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
    EAL: Selected IOVA mode 'VA'
    EAL: Probing VFIO support...
    EAL: PCI device 0000:03:00.0 on NUMA socket 0
    EAL: probe driver: 15b3:1007 net_mlx4
    Interactive-mode selected
    testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456, size=2176, socket=0
    testpmd: preferred mempool ops selected: ring_mp_mc
    testpmd: create a new mbuf pool <mbuf_pool_socket_1>: n=267456, size=2176, socket=1
    testpmd: preferred mempool ops selected: ring_mp_mc
    Configuring Port 0 (socket 0)
    Port 0: E4:1D:2D:79:B3:11
    Configuring Port 1 (socket 0)
    Port 1: E4:1D:2D:79:B3:12
    Checking link statuses...
    Done
    testpmd> quit
    Stopping port 0...
    Stopping ports...
    Done
    Stopping port 1...
    Stopping ports...
    Done
    Shutting down port 0...
    Closing ports...
    Done
    Shutting down port 1...
    Closing ports...
    Done
    Bye...
    [root@netqe8 ~]# uname -r
    4.18.0-167.el8.x86_64
    [root@netqe8 ~]# rpm -q dpdk
    dpdk-19.11-1.el8.x86_64
    [root@netqe8 ~]#

So, I believe we have verified the fix with dpdk-19.11-1.

Ok for me, thanks.

Hi David, I just found that netqe7 and netqe8 have different mlx4 firmware versions. Please check below. Not sure if that explains the different behaviour. Thanks! Jean

    [root@netqe7 ~]# ethtool -i enp3s0
    driver: mlx4_en
    version: 4.0-0
    firmware-version: 2.33.5100
    expansion-rom-version:
    bus-info: 0000:03:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: no
    supports-register-dump: no
    supports-priv-flags: yes
    [root@netqe7 ~]#
    [root@netqe8 ~]# ethtool -i enp3s0
    driver: mlx4_en
    version: 4.0-0
    firmware-version: 2.42.5000
    expansion-rom-version:
    bus-info: 0000:03:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: no
    supports-register-dump: no
    supports-priv-flags: yes
    [root@netqe8 ~]#

After configuring /etc/rdma/mlx4.conf with "0000:03:00.0 eth eth", mlx4_en on netqe7 is now working.

I knew about this kind of configuration, but did not know we had to do this on RHEL. Thanks for the tip.
|
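
The setup steps scattered through the comments above (file capabilities on testpmd, hugepage ownership, the mlx4 port-type line) can be collected into one script. This is only a sketch under stated assumptions: the function names, the `DRY_RUN` switch, and the `TESTPMD` and `MLX4_CONF` override variables are hypothetical and not part of the report. The commands themselves and the default values (testuser, 0000:03:00.0, /usr/bin/testpmd) are copied from the logs. Whether a driver reload or reboot is needed after editing /etc/rdma/mlx4.conf is not stated in the thread, so the sketch does not attempt one.

```shell
#!/bin/sh
# Sketch only: replays the non-root testpmd setup steps quoted in this report.
# Assumptions (not from the report): the function names, the DRY_RUN switch,
# and the TESTPMD / MLX4_CONF override variables.

DRY_RUN="${DRY_RUN:-1}"                        # 1 = print commands, 0 = execute
RUN_USER="${RUN_USER:-testuser}"               # unprivileged user from the logs
PCI_ADDR="${PCI_ADDR:-0000:03:00.0}"           # mlx4 device from the logs
TESTPMD="${TESTPMD:-/usr/bin/testpmd}"         # path shown in the gdb output
MLX4_CONF="${MLX4_CONF:-/etc/rdma/mlx4.conf}"  # file named in the last comments

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Set both ports of the card to Ethernet, as described in the thread.
# The line is appended only if it is not already present.
set_mlx4_port_types() {
    line="$PCI_ADDR eth eth"
    if [ -f "$MLX4_CONF" ] && grep -qxF "$line" "$MLX4_CONF"; then
        return 0
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ append '$line' to $MLX4_CONF"
    else
        printf '%s\n' "$line" >> "$MLX4_CONF"
    fi
}

# File capabilities plus hugepage ownership, as run on netqe7.
setup_nonroot_testpmd() {
    run setcap cap_net_admin,cap_net_raw,cap_ipc_lock+ep "$TESTPMD"
    for f in /dev/hugepages/rtemap*; do
        if [ -e "$f" ]; then
            run chown "$RUN_USER" "$f"
        fi
    done
    run chown "$RUN_USER" /dev/hugepages
}

# Start testpmd interactively as the unprivileged user.
launch_testpmd() {
    run sudo -u "$RUN_USER" "$TESTPMD" -w "$PCI_ADDR" -v -- -i
}
```

Sourcing the file and calling `set_mlx4_port_types`, `setup_nonroot_testpmd`, and `launch_testpmd` with the default `DRY_RUN=1` only prints what would be run; setting `DRY_RUN=0` as root executes the same commands.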