Bug 465843

Summary: can't run openmpi jobs using --mca btl udapl
Product: Red Hat Enterprise Linux 5
Reporter: Mehdi Bozzo-Rey <mbozzore>
Component: dapl
Assignee: Doug Ledford <dledford>
Status: CLOSED WONTFIX
QA Contact: Martin Jenner <mjenner>
Severity: urgent
Priority: medium
Version: 5.2
CC: fenlason, gozen, syeghiay
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-08-02 19:12:26 UTC

Description Mehdi Bozzo-Rey 2008-10-06 17:30:06 UTC
Description of problem:
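Running Open MPI jobs with --mca btl udapl,self fails: none of the uDAPL providers listed in dat.conf can be opened (DAT_PROVIDER_NOT_FOUND), and uDAPL reports that it is unable to find any NICs.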

Version-Release number of selected component (if applicable): RHEL 5.2 / compat-dapl-1.2.5-2.0.7-2.el5


Actual results:


Expected results:

[mbozzore@dr08 examples]$  mpirun -np 4 --machinefile ./hosts --mca btl udapl,self ./hello_c
Hello, world, I am 1 of 4
Hello, world, I am 0 of 4
Hello, world, I am 3 of 4
Hello, world, I am 2 of 4

Additional info:


Setup: ran cp /etc/ofed/dat.conf /etc/dat.conf on all the nodes.
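
For reference, the token dump in the debug output implies the shape of each /etc/dat.conf entry: eight whitespace-separated fields, the last two quoted. The following is a minimal Python sketch, not part of the original report, that parses such entries so the provider libraries they name can be listed per node. The two sample lines are reconstructed from the logged tokens rather than copied from the real file, and the field names thread_safety, default_flag and platform_params are labels chosen here for illustration (the remaining names follow those printed in the log).

```python
import shlex

# Field order as it appears in the token dump of the debug log.
# "thread_safety", "default_flag" and "platform_params" are illustrative labels.
FIELDS = (
    "ia_name", "api_version", "thread_safety", "default_flag",
    "lib_path", "provider_version", "ia_params", "platform_params",
)

# Reconstructed from the logged tokens; an assumption, not the real file.
SAMPLE_DAT_CONF = '''\
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
'''


def parse_dat_conf(text):
    """Return one dict per non-blank, non-comment dat.conf line."""
    entries = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # shlex keeps quoted values such as "ib0 0" and "" as single tokens.
        tokens = shlex.split(line)
        if len(tokens) != len(FIELDS):
            raise ValueError(
                f"line {number}: expected {len(FIELDS)} fields, got {len(tokens)}"
            )
        entries.append(dict(zip(FIELDS, tokens)))
    return entries


def libraries_by_api_version(entries):
    """Group provider library names by the API version of their entries."""
    grouped = {}
    for entry in entries:
        grouped.setdefault(entry["api_version"], set()).add(entry["lib_path"])
    return grouped


if __name__ == "__main__":
    parsed = parse_dat_conf(SAMPLE_DAT_CONF)
    for version, libs in sorted(libraries_by_api_version(parsed).items()):
        print(version, ", ".join(sorted(libs)))
```

Grouping by API version is only a convenience for reading the log: the file mixes u1.2 entries (libdaplcma.so.1) with u2.0 entries (libdaplofa.so.2), while every dat_ia_openv call in the debug output requests version 1:2.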

with debug enabled:

[mbozzore@compute-00-02 ompi_udapl]$ mpirun -np 4 --machinefile ./hosts --mca btl udapl,self ./hello_c
DAT Registry: Started (dat_init)
DAT Registry: static registry file </etc/dat.conf>

DAT Registry: token
 type  string
 value <OpenIB-cma>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <ib0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params ib0 0

DAT Registry: loading provider for OpenIB-cma

DAT Registry: token
 type  string
 value <OpenIB-cma-1>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <ib1 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma-1
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params ib1 0

DAT Registry: loading provider for OpenIB-cma-1

DAT Registry: token
 type  string
 value <OpenIB-cma-2>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <ib2 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma-2
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params ib2 0

DAT Registry: loading provider for OpenIB-cma-2

DAT Registry: token
 type  string
 value <OpenIB-cma-3>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <ib3 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma-3
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params ib3 0

DAT Registry: loading provider for OpenIB-cma-3

DAT Registry: token
 type  string
 value <OpenIB-bond>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <bond0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-bond
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params bond0 0

DAT Registry: loading provider for OpenIB-bond

DAT Registry: token
 type  string
 value <ofa-v2-ib0>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib0
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib0 0

DAT Registry: loading provider for ofa-v2-ib0

DAT Registry: token
 type  string
 value <ofa-v2-ib1>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib1 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib1
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib1 0

DAT Registry: loading provider for ofa-v2-ib1

DAT Registry: token
 type  string
 value <ofa-v2-ib2>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib2 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib2
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib2 0

DAT Registry: loading provider for ofa-v2-ib2

DAT Registry: token
 type  string
 value <ofa-v2-ib3>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib3 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib3
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib3 0

DAT Registry: loading provider for ofa-v2-ib3

DAT Registry: token
 type  string
 value <ofa-v2-bond>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <bond0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-bond
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params bond0 0

DAT Registry: loading provider for ofa-v2-bond

DAT Registry: token
 type  eof
 value <>

DAT Registry: Started (dat_init)
DAT Registry: static registry file </etc/dat.conf>

DAT Registry: token
 type  string
 value <OpenIB-cma>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-3" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-bond" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib0" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>

--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: uDAPL on host compute-00-03 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------

DAT Registry: token
 type  string
 value <ib0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params ib0 0

DAT Registry: loading provider for OpenIB-cma

DAT Registry: token
 type  string
 value <OpenIB-cma-1>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <ib1 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma-1
 api_version
     type 0x0
     major.minor 1.2
DAT Registry: dat_registry_list_providers () called
DAT Registry: dat_ia_openv (OpenIB-cma,1:2,0) called
DAT Registry: IA OpenIB-cma, trying to load library libdaplcma.so.1
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params ib1 0

DAT Registry: loading provider for OpenIB-cma-1

DAT Registry: token
 type  string
 value <OpenIB-cma-2>


DAT Registry: token
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>

DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma not found in dynamic registry

DAT Registry: token
 type  string
 value <dapl.1.2>

DAT Registry: dat_ia_openv (OpenIB-cma-1,1:2,0) called
DAT Registry: IA OpenIB-cma-1, trying to load library libdaplcma.so.1
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------

DAT Registry: token
 type  string
 value <ib2 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma-2
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma-1 not found in dynamic registry
 ia_params ib2 0

DAT Registry: loading provider for OpenIB-cma-2

DAT Registry: token
 type  string
 value <OpenIB-cma-3>


DAT Registry: token
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (OpenIB-cma-2,1:2,0) called
DAT Registry: IA OpenIB-cma-2, trying to load library libdaplcma.so.1
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <ib3 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-cma-3
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma-2 not found in dynamic registry
     id dapl
     major.minor 1.2
 ia_params ib3 0

DAT Registry: loading provider for OpenIB-cma-3

DAT Registry: token
 type  string
 value <OpenIB-bond>


DAT Registry: token
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
 type  string
 value <u1.2>


DAT Registry: token
 type  string
 value <nonthreadsafe>

DAT Registry: dat_ia_openv (OpenIB-cma-3,1:2,0) called
DAT Registry: IA OpenIB-cma-3, trying to load library libdaplcma.so.1

DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplcma.so.1>


DAT Registry: token
 type  string
 value <dapl.1.2>


DAT Registry: token
 type  string
 value <bond0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name OpenIB-bond
 api_version
     type 0x0
     major.minor 1.2
 is_thread_safe 0
 is_default 1
 lib_path libdaplcma.so.1
 provider_version
     id dapl
     major.minor 1.2
 ia_params bond0 0

DAT Registry: loading provider for OpenIB-bond
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------

DAT Registry: token
 type  string
 value <ofa-v2-ib0>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma-3 not found in dynamic registry
 type  string
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-3" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (OpenIB-bond,1:2,0) called
DAT Registry: IA OpenIB-bond, trying to load library libdaplcma.so.1
 type  string
 value <ib0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-3" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
 ia_name ofa-v2-ib0
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib0 0

DAT Registry: loading provider for ofa-v2-ib0

DAT Registry: token
 type  string
 value <ofa-v2-ib1>


DAT Registry: token
 type  string
 value <u2.0>

--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-bond" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------

DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib0" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-bond not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib1 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-bond" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: uDAPL on host compute-00-03 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (ofa-v2-ib0,1:2,0) called
DAT Registry: dat_ia_open () provider information for IA name ofa-v2-ib0 not found in dynamic registry
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib1
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib1 0

DAT Registry: loading provider for ofa-v2-ib1

DAT Registry: token
 type  string
 value <ofa-v2-ib2>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>

--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib0" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (ofa-v2-ib1,1:2,0) called
DAT Registry: dat_ia_open () provider information for IA name ofa-v2-ib1 not found in dynamic registry

DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>

--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (ofa-v2-ib2,1:2,0) called
DAT Registry: dat_ia_open () provider information for IA name ofa-v2-ib2 not found in dynamic registry

DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib2 0>


DAT Registry: token
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,0]: uDAPL on host compute-00-02 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
DAT Registry: Stopped (dat_fini)
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib2
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib2 0

DAT Registry: loading provider for ofa-v2-ib2

DAT Registry: token
 type  string
 value <ofa-v2-ib3>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <ib3 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-ib3
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params ib3 0

DAT Registry: loading provider for ofa-v2-ib3

DAT Registry: token
 type  string
 value <ofa-v2-bond>


DAT Registry: token
 type  string
 value <u2.0>


DAT Registry: token
 type  string
 value <nonthreadsafe>


DAT Registry: token
 type  string
 value <default>


DAT Registry: token
 type  string
 value <libdaplofa.so.2>


DAT Registry: token
 type  string
 value <dapl.2.0>


DAT Registry: token
 type  string
 value <bond0 0>


DAT Registry: token
 type  string
 value <>


DAT Registry: token
 type  eor
 value <>


DAT Registry: entry
 ia_name ofa-v2-bond
 api_version
     type 0x0
     major.minor 2.0
 is_thread_safe 0
 is_default 1
 lib_path libdaplofa.so.2
 provider_version
     id dapl
     major.minor 2.0
 ia_params bond0 0

DAT Registry: loading provider for ofa-v2-bond

DAT Registry: token
 type  eof
 value <>

DAT Registry: dat_registry_list_providers () called
DAT Registry: dat_ia_openv (OpenIB-cma,1:2,0) called
DAT Registry: IA OpenIB-cma, trying to load library libdaplcma.so.1
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (OpenIB-cma-1,1:2,0) called
DAT Registry: IA OpenIB-cma-1, trying to load library libdaplcma.so.1
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma-1 not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (OpenIB-cma-2,1:2,0) called
DAT Registry: IA OpenIB-cma-2, trying to load library libdaplcma.so.1
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma-2 not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (OpenIB-cma-3,1:2,0) called
DAT Registry: IA OpenIB-cma-3, trying to load library libdaplcma.so.1
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-cma-3 not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-cma-3" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (OpenIB-bond,1:2,0) called
DAT Registry: IA OpenIB-bond, trying to load library libdaplcma.so.1
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
DAT Registry: static registry unable to load library libdaplcma.so.1
DAT Registry: dat_ia_open () provider information for IA name OpenIB-bond not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "OpenIB-bond" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (ofa-v2-ib0,1:2,0) called
DAT Registry: dat_ia_open () provider information for IA name ofa-v2-ib0 not found in dynamic registry
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib0" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (ofa-v2-ib1,1:2,0) called
DAT Registry: dat_ia_open () provider information for IA name ofa-v2-ib1 not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT Registry: dat_ia_openv (ofa-v2-ib2,1:2,0) called
DAT Registry: dat_ia_open () provider information for IA name ofa-v2-ib2 not found in dynamic registry
--------------------------------------------------------------------------

WARNING: Failed to open "ofa-v2-ib2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: uDAPL on host compute-00-02 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
DAT Registry: Stopped (dat_fini)
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)




with no /etc/dat.conf:

[mbozzore@compute-00-02 ompi_udapl]$ mpirun -np 4 --machinefile ./hosts --mca btl udapl,self ./hello_c
--------------------------------------------------------------------------
[0,1,0]: uDAPL on host compute-00-02 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: uDAPL on host compute-00-02 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: uDAPL on host compute-00-03 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: uDAPL on host compute-00-03 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 2 with PID 9087 on node compute-00-02 exited on signal 1 (Hangup).
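
A note on reading the debug log above: every dat_ia_openv call requests DAT version 1:2, while the ofa-v2-* entries are registered with api_version 2.0, which may be why they are reported as not found in the registry. The OpenIB-* entries fail for a different reason: libdaplcma.so.1 cannot resolve dat_registry_add_provider. The following is a minimal Python sketch, not part of the original report, that lists which dat.conf entries match a requested API version. The helper names parse_dat_conf and split_by_api are hypothetical, the field order is assumed from the token order printed in the log, and SAMPLE reconstructs two entries from those tokens rather than quoting the real file.

```python
import shlex

# Two entries rebuilt from the tokens shown in the debug log (an assumption,
# not a copy of the actual /etc/dat.conf on the reporter's nodes).
SAMPLE = '''
# ia_name api_version threadsafety default lib_path provider_version ia_params platform_params
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
ofa-v2-bond u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "bond0 0" ""
'''


def parse_dat_conf(text):
    """Parse dat.conf-style text into a list of dicts (hypothetical helper)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # honours the quoted "ib0 0" style fields
        if len(fields) < 7:
            continue  # not a complete entry
        entries.append({
            "ia_name": fields[0],
            "api_version": fields[1],
            "lib_path": fields[4],
            "ia_params": fields[6],
        })
    return entries


def split_by_api(entries, requested="u1.2"):
    """Return (matching, other) IA names for the requested API version."""
    matching = [e["ia_name"] for e in entries if e["api_version"] == requested]
    other = [e["ia_name"] for e in entries if e["api_version"] != requested]
    return matching, other


if __name__ == "__main__":
    matching, other = split_by_api(parse_dat_conf(SAMPLE), "u1.2")
    print("entries matching u1.2:", matching)
    print("entries with another API version:", other)
```

To run it against a real node, read the contents of /etc/dat.conf into parse_dat_conf instead of SAMPLE. A match on api_version only says the entry is a candidate; the provider library still has to load, which is the step that fails for libdaplcma.so.1 in the log.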

Comment 3 Doug Ledford 2012-08-02 19:12:26 UTC
This bug was overlooked for a long time. After talks with the openmpi upstream, it was determined that openmpi over udapl was mainly a Solaris feature that upstream did not really want to support on any other platform, and later openmpi binaries do not even include a udapl btl. As such, I'm closing this bug as WONTFIX.