Bug 1791483
Summary: | openmpi mpirun command displays "undefined symbol: uct_ep_create_connected (ignored)" | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Afom T. Michael <tmichael>
Component: | openmpi | Assignee: | Honggang LI <honli>
Status: | CLOSED ERRATA | QA Contact: | Afom T. Michael <tmichael>
Severity: | unspecified | Priority: | unspecified
Version: | 8.2 | CC: | rdma-dev-team
Target Milestone: | rc | Flags: | pm-rhel: mirror+
Target Release: | 8.2 | Hardware: | x86_64
OS: | Linux | Type: | Bug
Fixed In Version: | openmpi-4.0.2-2.el8 | Last Closed: | 2020-04-28 16:57:33 UTC
Description
Afom T. Michael, 2020-01-15 23:06:50 UTC
(In reply to Afom T. Michael from comment #0)

> Description of problem:
> On RHEL-8.2 (4.18.0-167.el8.x86_64), running the openmpi mpirun command given in
> step 3 of the reproduce section displays "mca_base_component_repository_open:
> unable to open mca_btl_uct: /usr/lib64/openmpi/lib/openmpi/mca_btl_uct.so:
> undefined symbol: uct_ep_create_connected (ignored)".

The relevant compatibility shim in the mca_btl_uct source selects the UCX entry point at compile time:

```c
static inline ucs_status_t mca_btl_uct_ep_create_connected_compat (uct_iface_h iface, uct_device_addr_t *device_addr,
                                                                   uct_iface_addr_t *iface_addr, uct_ep_h *uct_ep)
{
#if UCT_API >= UCT_VERSION(1, 6)
    uct_ep_params_t ep_params = {.field_mask = UCT_EP_PARAM_FIELD_IFACE | UCT_EP_PARAM_FIELD_DEV_ADDR | UCT_EP_PARAM_FIELD_IFACE_ADDR,
                                 .iface = iface, .dev_addr = device_addr, .iface_addr = iface_addr};
    return uct_ep_create (&ep_params, uct_ep);
#else
    return uct_ep_create_connected (iface, device_addr, iface_addr, uct_ep);
#endif
}
```

The error message says uct_ep_create_connected was needed, which means openmpi had been compiled with UCT_API < UCT_VERSION(1, 6). The build log confirms openmpi-4.0.2-1.el8 was built against ucx-1.5:

https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=988928 (openmpi-4.0.2-1.el8)
http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/openmpi/4.0.2/1.el8/data/logs/x86_64/root.log

```
DEBUG util.py:439: ucx-devel x86_64 1.5.2-1.el8 build 106 k
```

Rebuilding openmpi against the in-box ucx >= 1.6 gets rid of the error message. After updating to openmpi-4.0.2-2.el8.x86_64, the "...mca_base_component_repository_open: ..." warning is gone. Marking verified.
```
[root@rdma-dev-26 ~]$ rpm -q openmpi mpitests-openmpi
openmpi-4.0.2-2.el8.x86_64
mpitests-openmpi-5.4.2-4.el8.x86_64
[root@rdma-dev-26 ~]$ timeout 3m /usr/lib64/openmpi/bin/mpirun --allow-run-as-root --map-by node -mca btl_openib_warn_nonexistent_if 0 -mca btl_openib_if_include bnxt_re0:1 -mca mtl '^psm2,psm,ofi' -mca btl '^openib,usnic' -hostfile /root/hfile_one_core -np 2 /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 PingPong
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018 Update 1, MPI-1 part
#------------------------------------------------------------
# Date                  : Fri Jan 24 13:37:51 2020
# Machine               : x86_64
# System                : Linux
# Release               : 4.18.0-167.el8.x86_64
# Version               : #1 SMP Sun Dec 15 01:24:23 UTC 2019
# MPI Version           : 3.1
# MPI Thread Environment:

# Calling sequence was:
# /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 PingPong

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
# List of Benchmarks to run:
# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         9.49         0.00
            1         1000         9.50         0.11
[...snip...]
      4194304           10      1893.83      2214.72

# All processes entering MPI_Finalize
[root@rdma-dev-26 ~]$
```

With openmpi-4.0.2-1.el8.x86_64, below is what was seen.
```
[root@rdma-dev-26 ~]$ rpm -q openmpi mpitests-openmpi
openmpi-4.0.2-1.el8.x86_64
mpitests-openmpi-5.4.2-4.el8.x86_64
[root@rdma-dev-26 ~]$ timeout 3m /usr/lib64/openmpi/bin/mpirun --allow-run-as-root --map-by node -mca btl_openib_warn_nonexistent_if 0 -mca btl_openib_if_include bnxt_re0:1 -mca mtl '^psm2,psm,ofi' -mca btl '^openib,usnic' -hostfile /root/hfile_one_core -np 2 /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 PingPong
[rdma-dev-26.lab.bos.redhat.com:20922] mca_base_component_repository_open: unable to open mca_btl_uct: /usr/lib64/openmpi/lib/openmpi/mca_btl_uct.so: undefined symbol: uct_ep_create_connected (ignored)
[rdma-dev-25.lab.bos.redhat.com:17345] mca_base_component_repository_open: unable to open mca_btl_uct: /usr/lib64/openmpi/lib/openmpi/mca_btl_uct.so: undefined symbol: uct_ep_create_connected (ignored)
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018 Update 1, MPI-1 part
#------------------------------------------------------------
# Date                  : Fri Jan 24 13:36:09 2020
# Machine               : x86_64
# System                : Linux
# Release               : 4.18.0-167.el8.x86_64
# Version               : #1 SMP Sun Dec 15 01:24:23 UTC 2019
# MPI Version           : 3.1
# MPI Thread Environment:

# Calling sequence was:
# /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 PingPong

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
# List of Benchmarks to run:
# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        10.13         0.00
            1         1000         9.91         0.10
[...snip...]
      4194304           10      2170.53      1932.39

# All processes entering MPI_Finalize
[root@rdma-dev-26 ~]$
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1865