Bug 1634159
| Summary: | Enable support for Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD on power | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | David J. Wilder <wilder> | ||||||
| Component: | openvswitch2.10 | Assignee: | Timothy Redaelli <tredaelli> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Ping Zhang <pizhang> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 7.5 | CC: | ahleihel, atragler, ctrautma, kzhang, linville, mleitner, noas, qding, rkhan, tredaelli, wilder | ||||||
| Target Milestone: | rc | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | openvswitch2.10-2.10.0-10.el7fdb.1 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2020-07-20 15:29:03 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
Hi Alaa,
Can you please work with David and Tim to resolve this?

Hi David,

(In reply to David J. Wilder from comment #0)
> Apply the attached patch to correct build issue on power when MLX support is
> enabled.

Which RHEL and rdma-core versions did you use?
This issue should not happen with rdma-core v17. It was already fixed (mlx5dv.h will include <sys/types.h>).

Thanks,
Alaa

(In reply to Alaa Hleihel from comment #3)
> Hi David,
>
> (In reply to David J. Wilder from comment #0)
> > Apply the attached patch to correct build issue on power when MLX support is
> > enabled.
>
> Which RHEL and rdma-core versions did you use ?
> This issue should not happen with rdma-core v17. It was already fixed
> (mlx5dv.h will include <sys/types.h>).
>
> Thanks,
> Alaa

Hi Alaa,
You are correct. I upgraded rdma-core to v17.2, and openvswitch built without my patch and with MLX5 enabled.
Thanks,
David

Hi Anita,
Following our discussion, we will assign the BZ to Red Hat to make progress with this request.
Regards,
Noa

Tim, assigning to you as these are "just" config changes now, assuming David will help test the updated packages. rdma-core on RHEL7 is already at the version mentioned in comment #4. I don't have much experience with ppc64, but please let me know if I can help in any way.
Thanks

Thanks Timothy, I will test it.

I am hitting an ovs-vswitchd crash with openvswitch2.10 on p9.
The problem happens when running the PVP test.
gdb -c core
....
Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfi'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fff98210ed0 in mlx5_tx_complete (txq=0x7ff88d88c380)
at /usr/src/debug/openvswitch-2.10.0/dpdk-17.11/drivers/net/mlx5/mlx5_rxtx.h:481
481 free[blk_n++] = m;
(gdb) list
476 /* Free buffers. */
477 while (elts_free != elts_tail) {
478 m = rte_pktmbuf_prefree_seg((*txq->elts)[elts_free++ & elts_m]);
479 if (likely(m != NULL)) {
480 if (likely(m->pool == pool)) {
481 free[blk_n++] = m;
482 } else {
483 if (likely(pool != NULL))
484 rte_mempool_put_bulk(pool,
485 (void *)free,
(gdb) print blk_n
$1 = 3241
(gdb) print elts_free
$2 = 916
(gdb) print elts_tail
$3 = 58585
(gdb) print elts_m
$4 = <optimized out>
(gdb) print *txq->elts
Cannot access memory at address 0x7ff88d88c408
I am working with Mellanox on another issue in the same area of the code, seen when running testpmd on the host; it might be related. I will do more debugging to see if the two problems match.
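The gdb values above can be sanity-checked with a little ring arithmetic. The following is only a sketch, not the driver's code: it assumes elts_free and elts_tail are free-running 16-bit counters and that the scratch array free[] is sized to the ring, neither of which is confirmed in this report. The helper names and the ring size of 1024 used in the note below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values printed by gdb in the core dump above. */
#define DUMP_ELTS_FREE 916u
#define DUMP_ELTS_TAIL 58585u
#define DUMP_BLK_N     3241u

/* Entries between two free-running 16-bit ring counters, modulo 2^16.
 * ASSUMPTION: the counters are 16-bit; this is not stated in the report. */
static inline uint16_t ring_pending(uint16_t tail, uint16_t head)
{
    return (uint16_t)(tail - head);
}

/* Slot addressed by a free-running counter, mirroring the
 * "elts_free++ & elts_m" expression in the listing. The ring size
 * must be a power of two so that the mask is elts_n - 1. */
static inline uint16_t ring_slot(uint16_t counter, uint16_t elts_n)
{
    assert(elts_n != 0 && (elts_n & (elts_n - 1)) == 0);
    return (uint16_t)(counter & (uint16_t)(elts_n - 1));
}

/* True when a scratch-array index is already past an array of elts_n
 * entries. ASSUMPTION: free[] holds elts_n pointers. */
static inline int scratch_overflowed(unsigned blk_n, unsigned elts_n)
{
    return blk_n >= elts_n;
}
```

With the dumped values, ring_pending(DUMP_ELTS_TAIL, DUMP_ELTS_FREE) is 57669 pending entries, and for a hypothetical 1024-entry ring scratch_overflowed(DUMP_BLK_N, 1024) is true. Under these assumptions the loop would have been writing past the end of free[] well before the fault at line 481, which would be consistent with a corrupted or stale elts_tail rather than a genuine backlog.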
Here is some configuration data.
# rpm -qa | grep openvswitch
openvswitch2.10-2.10.0-10.el7fdb.1.ppc64le
openvswitch2.10-debuginfo-2.10.0-10.el7fdb.1.ppc64le
[root@ltc17u31 /]# rpm -qa | grep rdma-core
rdma-core-15-7.el7_5.ppc64le
rdma-core-devel-15-7.el7_5.ppc64le
[root@ltc17u31 /]# rpm -qa | grep ibacm
ibacm-15-7.el7_5.ppc64le
[root@ltc17u31 /]# rpm -qa | grep libibcm
libibcm-15-7.el7_5.ppc64le
[root@ltc17u31 /]# rpm -qa | grep libibumad
libibumad-15-7.el7_5.ppc64le
[root@ltc17u31 /]# rpm -qa | grep libibverbs
libibverbs-15-7.el7_5.ppc64le
libibverbs-utils-15-7.el7_5.ppc64le
[root@ltc17u31 /]# rpm -qa | grep librdmacm
librdmacm-15-7.el7_5.ppc64le
[root@ltc17u31 /]# uname -a
Linux ltc17u31 4.14.0-49.el7a.ppc64le #1 SMP Wed Mar 14 13:58:40 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
[root@ltc17u31 /]# ovs-vsctl show
2adc7953-e1bf-4ec7-b39b-42797a141e72
Bridge "ovs_pvp_br0"
fail_mode: secure
Port "dpdk0"
Interface "dpdk0"
type: dpdk
options: {dpdk-devargs="0000:01:00.0", n_rxq="2"}
Port "ovs_pvp_br0"
Interface "ovs_pvp_br0"
type: internal
Port "vhost0"
Interface "vhost0"
type: dpdkvhostuserclient
options: {n_rxq="2", vhost-server-path="/tmp/vhost-sock0"}
ovs_version: "2.10.0"
-
dmesg
[ 5636.703686] pmd18[12905]: unhandled signal 11 at 00007fff9ef20000 nip 00007fffa9e60ed0 lr 00007fffa9b1df40 code 1
Created attachment 1497254 [details]
ovs-vswitchd.log, from ovs-vswitchd crash.
(In reply to David J. Wilder from comment #10)
> I am hitting a ovs-vswitchd crash with openvswitch2.10 on p9.
> The problem happens when running the PVP test.

Alaa, need your help here.

This is an issue with dpdk and the Mellanox pmd. It is being worked by Mellanox. RHEL is not currently supporting dpdk on power, so this issue is not blocking OVS.

(In reply to David J. Wilder from comment #13)
> This is an issue with dpdk and the Mellanox pmd. It is being worked by
> Mellanox. RHEL is not currently supporting dpdk on power, so this issue is
> not blocking OVS.

Thanks David for the update.
BTW, what is the Mellanox support case number?

Regards,
Alaa

FDB is not released using errata
Created attachment 1488228 [details]
included to work around the lack of off_t definition for mlx5dv.h.

Description of problem:
Please enable support for Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD in dpdk config for power.

Version-Release number of selected component (if applicable):
openvswitch 2.10 with dpdk 17.11

How reproducible:
Run testpmd with a CX4 or CX5 adapter on any power system.

Actual results:
PMD is not supported.

Please update the file ppc_64-power8-linuxapp-gcc-config to enable this support:
<....>
# Compile burst-oriented Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD
CONFIG_RTE_LIBRTE_MLX5_PMD=y
CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=y
CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
<....>

Apply the attached patch to correct build issue on power when MLX support is enabled.
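The build issue mentioned above (no off_t definition available to mlx5dv.h) comes down to include order: a header that uses off_t only compiles if <sys/types.h> has been included first, either by the header itself or by whoever includes it. The sketch below illustrates that general pattern only; it is not the attached patch, and example_region and region_end are hypothetical stand-ins rather than anything from rdma-core or DPDK.

```c
#include <sys/types.h>  /* defines off_t; must precede any use of it */
#include <assert.h>

/* Hypothetical stand-in for a declaration in a header that uses off_t
 * without including <sys/types.h> itself. Compiling this struct without
 * the include above fails with an "unknown type name 'off_t'" error. */
struct example_region {
    void  *base;    /* start of the mapped region */
    off_t  offset;  /* byte offset of the region within its backing object */
};

/* Offset of the first byte after a span of len bytes in the region. */
static inline off_t region_end(const struct example_region *r, off_t len)
{
    assert(r != 0 && len >= 0);
    return r->offset + len;
}
```

As noted in the comments, rdma-core v17 fixes this in mlx5dv.h itself, so a workaround of this kind should only matter when building against older rdma-core versions.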