Bug 1708656 - Intermittent failures with mvapich2 running over PSM2
Summary: Intermittent failures with mvapich2 running over PSM2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: mvapich2
Version: 8.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: Jarod Wilson
QA Contact: Afom T. Michael
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-10 13:41 UTC by Michael Heinz
Modified: 2020-12-20 07:20 UTC (History)

Fixed In Version: mvapich2-2.3.2-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 16:56:49 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
diff between mvapich 2.3b and 2.3-5 (367.09 KB, application/octet-stream)
2019-05-10 16:07 UTC, Michael Heinz
source code for mvapich2 as compiled by Intel (8.50 MB, application/octet-stream)
2019-05-10 16:08 UTC, Michael Heinz


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2020:1865 0 None None None 2020-04-28 16:57:43 UTC

Description Michael Heinz 2019-05-10 13:41:44 UTC
Description of problem:

When running the provided mpitests-mvapich2-psm2 package, we see intermittent crashes.

Version-Release number of selected component (if applicable):

mvapich2-psm2-2.3-5.el8.x86_64
mpitests-mvapich2-psm2-5.4.2-4.el8.x86_64
libpsm2-compat-11.2.80-1.x86_64
libpsm2-11.2.80-1.x86_64
libpsm2-debuginfo-11.2.80-1.x86_64


How reproducible:

When testing the mpitests-IMB-EXT application, we intermittently see a crash.

Steps to Reproduce:

/usr/lib64/mvapich2-psm2/bin/mpirun -env HFI_UNIT=0 -env IPATH_UNIT=0 -env MV2_IBA_HCA=hfi1_0 -env MV2_DEFAULT_PORT=1 -np 2 -hosts phkpstl001,phkpstl002 -env LD_LIBRARY_PATH=/usr/lib64/mvapich2-psm2/lib/ /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT
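
Because the crash shows up only intermittently, it can help to repeat the command until a run fails. The following is a minimal sketch and not part of the original report: `run_until_failure` and its `runner` parameter are hypothetical names, and `MPIRUN_CMD` simply restates the `mpirun` invocation above.

```python
import subprocess
from typing import Callable, Optional, Sequence

# The reproduction command from this report, split into arguments.
MPIRUN_CMD = [
    "/usr/lib64/mvapich2-psm2/bin/mpirun",
    "-env", "HFI_UNIT=0", "-env", "IPATH_UNIT=0",
    "-env", "MV2_IBA_HCA=hfi1_0", "-env", "MV2_DEFAULT_PORT=1",
    "-np", "2", "-hosts", "phkpstl001,phkpstl002",
    "-env", "LD_LIBRARY_PATH=/usr/lib64/mvapich2-psm2/lib/",
    "/usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT",
]


def run_until_failure(
    cmd: Sequence[str],
    max_runs: int = 20,
    runner: Callable[[Sequence[str]], int] = lambda c: subprocess.run(c).returncode,
) -> Optional[int]:
    """Run cmd up to max_runs times.

    Returns the 1-based number of the first failing run, or None if every
    run exited with status 0. The runner is injectable (an assumption made
    here so the loop can be exercised without an MPI installation).
    """
    for attempt in range(1, max_runs + 1):
        rc = runner(cmd)
        if rc != 0:
            print(f"run {attempt} failed with exit code {rc}")
            return attempt
    return None
```

Calling `run_until_failure(MPIRUN_CMD, max_runs=20)` on an affected system would stop at the first non-zero exit status; 20 is an arbitrary cap chosen for illustration.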

Actual results:

[phkpstl002.ph.intel.com:mpi_rank_1][error_sighandler] Caught error: Bus error (signal 7)
[proxy:0:0.intel.com] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:911): assert (!closed) failed
[proxy:0:0.intel.com] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:0.intel.com] main (pm/pmiserv/pmip.c:202): demux engine error waiting for event
[mpiexec.intel.com] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
[mpiexec.intel.com] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec.intel.com] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
[mpiexec.intel.com] main (ui/mpich/mpiexec.c:340): process manager error waiting for completion

(output trimmed)
#---------------------------------------------------
# Benchmarking Bidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        11.81         0.00
            4          100        18.59         0.22
            8          100        18.99         0.42
           16          100        18.50         0.86
           32          100        18.69         1.71
           64          100        18.71         3.42
          128          100        18.62         6.88
          256          100        19.33        13.24

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 86353 RUNNING AT phkpstl002
=   EXIT CODE: 135
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
Expected results:

(output trimmed)
#---------------------------------------------------
# Benchmarking Bidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        11.61         0.00
            4          100        18.86         0.21
            8          100        18.62         0.43
           16          100        18.54         0.86
           32          100        19.01         1.68
           64          100        18.96         3.38
          128          100        18.84         6.79
          256          100        19.00        13.48
          512          100        19.25        26.59
         1024          100        19.67        52.07
         2048          100        20.58        99.51
         4096          100        21.59       189.69
         8192          100        20.54       398.84
        16384          100        54.23       302.10
        32768          100        51.55       635.67
        65536          100        62.60      1046.83
       131072          100        74.09      1769.18
       262144          100        94.78      2765.93
       524288           80       138.75      3778.72
      1048576           40       220.01      4765.98
      2097152           20       399.60      5248.11
      4194304           10       703.60      5961.23
(output trimmed)
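
As a reading aid for the IMB tables: the `Mbytes/sec` column appears to be the message size divided by `t[usec]`, since bytes per microsecond equals 10^6 bytes per second. This relation is inferred from the numbers in the table above, not taken from the benchmark documentation, and `mbytes_per_sec` is a hypothetical helper name.

```python
def mbytes_per_sec(message_bytes: int, t_usec: float) -> float:
    """Throughput in 10^6 bytes per second for one transfer of message_bytes."""
    return message_bytes / t_usec


# Rows copied from the "Expected results" table: (bytes, t[usec], reported Mbytes/sec)
rows = [
    (8192, 20.54, 398.84),
    (1048576, 220.01, 4765.98),
    (4194304, 703.60, 5961.23),
]

for size, t_usec, reported in rows:
    computed = mbytes_per_sec(size, t_usec)
    # Small differences are expected because t[usec] is printed rounded.
    print(f"{size:>8} bytes: computed {computed:.2f}, reported {reported:.2f}")
```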


Additional info:

1. Problem does not occur with mvapich2-2.3b as compiled by Intel. 
2. Problem does not appear to be in the libpsm2 library provided by RHEL because replacing it with the Intel version does not resolve the issue.
3. Problem does not occur with every run (usually every 5-10 runs) but does always appear to happen in the BiDir_Get test.
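
The `EXIT CODE: 135` in the termination banner is consistent with the `Bus error (signal 7)` lines, assuming the common convention that a process killed by signal N is reported with status 128 + N. A small sketch of that arithmetic (`signal_from_exit_code` is a hypothetical helper, not something from the MVAPICH2 or Hydra sources):

```python
from typing import Optional


def signal_from_exit_code(exit_code: int) -> Optional[int]:
    """Return the signal number implied by a 128 + N exit status, else None."""
    if 128 < exit_code < 256:
        return exit_code - 128
    return None


print(signal_from_exit_code(135))  # 7
```

On Linux x86_64, signal 7 is SIGBUS, which matches the "Bus error" text reported by the error handler.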

Comment 1 Don Dutile (Red Hat) 2019-05-10 15:10:20 UTC
re:
1. Problem does not occur with mvapich2-2.3b as compiled by Intel. 
2. Problem does not appear to be in the libpsm2 library provided by RHEL because replacing it with the Intel version does not resolve the issue.
3. Problem does not occur with every run (usually every 5-10 runs) but does always appear to happen in the BiDir_Get test.

Does RH have access to the Intel mvapich2-2.3b sources?
-- if not, can you provide diffs and/or commits between RHEL's mvapich2-2.3-5 version and Intel's mvapich2-2.3b version?

thanks.

Comment 2 Michael Heinz 2019-05-10 16:06:09 UTC
From the changelog, the 2.3b release of MVAPICH2 was on 08/10/2017. However, I do not know if it is still available from OSU, so I will add the srpm as an attachment.

Comment 3 Michael Heinz 2019-05-10 16:07:14 UTC
Created attachment 1566724 [details]
diff between mvapich 2.3b and 2.3-5

Comment 4 Michael Heinz 2019-05-10 16:08:17 UTC
Created attachment 1566725 [details]
source code for mvapich2 as compiled by Intel

Comment 5 Michael Heinz 2019-05-10 16:10:31 UTC
Comment on attachment 1566725 [details]
source code for mvapich2 as compiled by Intel

This attachment is truncated.

Comment 6 Michael Heinz 2019-05-10 16:11:30 UTC
Sorry, looks like I can't upload the full source.

Comment 7 Jarod Wilson 2019-06-19 19:09:47 UTC
Full source not required, we keep older source rpms and tarballs around in our build system. That said... This probably needs to be taken upstream to the mvapich2 folks. For the most part, we just package upstream here, we don't actually do much of anything in the way of development work. Upstream may see the report and know immediately what broke and how to fix it. Can you try chasing there? I'm also seeing an mvapich2 2.3.1 release now, which may potentially already carry a fix, so if you could double-check that as well, maybe a fix here is as simple as an update to 2.3.1.

Comment 8 Jarod Wilson 2019-06-19 19:14:33 UTC
Highlights from the MVAPICH2 changelog:

MVAPICH2 2.3.1 (03/01/2019)

...
* Bug Fixes (since 2.3):
...
    - Fix issues with MPI-3 shared memory windows for PSM-CH3 and PSM2-CH3
      channel
        - Thanks to Adam Moody @LLNL for the report
...
    - Fix issues with MPI_Mprobe/Improbe and MPI_Mrecv/Imrecv for PSM-CH3 and
      PSM2-CH3 channel
        - Thanks to Adam Moody @LLNL for the report

Wondering if either of these PSM-specific fixes is relevant here.

Comment 9 Jarod Wilson 2019-07-23 15:35:56 UTC
Still looking for some feedback here, but I think we'll go ahead with mvapich2 2.3.1 for RHEL-8.2, in hopes it fixes this.

Comment 10 Michael Heinz 2019-07-31 13:49:22 UTC
(In reply to Jarod Wilson from comment #9)
> Still looking for some feedback here, but I think we'll go ahead with
> mvapich2 2.3.1 for RHEL-8.2, in hopes it fixes this.

Sorry - I've been on sabbatical for the past 2 months. I will try the new version as soon as I can.

Comment 11 Michael Heinz 2019-08-09 14:11:40 UTC
Hey @Jarod - would it be possible to get a copy of the RHEL version of mvapich2 2.3.1 to test? I looked around, did not find it.

I'd probably have to install it on a RHEL 8.0 system.

Comment 12 Jarod Wilson 2019-08-15 18:47:06 UTC
(In reply to Michael Heinz from comment #11)
> Hey @Jarod - would it be possible to get a copy of the RHEL version of
> mvapich2 2.3.1 to test? I looked around, did not find it.
> 
> I'd probably have to install it on a RHEL 8.0 system.

It's not been formally pushed out the door yet, but here's a copy of the pending SRPM:

http://people.redhat.com/~jwilson/misc/mvapich2-2.3.1-1.el8.src.rpm

Comment 13 Michael Heinz 2019-08-19 12:45:13 UTC
So, when I build mvapich2 2.3.1 for psm2 using our usual configuration settings it works fine. But when I rebuilt this SRPM using the default arguments, it still crashes:

[root@phkpstl037 bin]# /usr/lib64/mvapich2-psm2/bin/mpirun -env HFI_UNIT=0 -env IPATH_UNIT=0 -env MV2_IBA_HCA=hfi1_0 -env MV2_DEFAULT_PORT=1 -np 2 -hosts phkpstl037,phkpstl038 -env LD_LIBRARY_PATH=/usr/lib64/mvapich2-psm2/lib/ /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018 Update 1, MPI-2 part    
#------------------------------------------------------------
# Date                  : Mon Aug 19 08:32:48 2019
# Machine               : x86_64
# System                : Linux
# Release               : 4.18.0-80.el8.x86_64
# Version               : #1 SMP Wed Mar 13 12:02:46 UTC 2019
# MPI Version           : 3.1
# MPI Thread Environment: 


# Calling sequence was: 

# /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE 
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM  
#
#

# List of Benchmarks to run:

# Window
# Unidir_Get
# Unidir_Put
# Bidir_Get
# Bidir_Put
# Accumulate

#----------------------------------------------------------------
# Benchmarking Window 
# #processes = 2 
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0          100       301.65       301.67       301.66
            4          100       296.96       296.96       296.96
            8          100       286.37       286.37       286.37
           16          100       286.33       286.34       286.33
           32          100       294.80       294.80       294.80
           64          100       295.42       295.43       295.43
          128          100       292.73       292.73       292.73
          256          100       292.10       292.11       292.10
          512          100       293.44       293.44       293.44
         1024          100       293.39       293.40       293.40
         2048          100       325.94       325.95       325.95
         4096          100       315.70       315.70       315.70
         8192          100       320.24       320.24       320.24
        16384          100       320.07       320.08       320.07
        32768          100       316.40       316.48       316.44
[cli_0]: aborting job:
Fatal error in PMPI_Comm_create: Other MPI error, error stack:
PMPI_Comm_create(582)...............: MPI_Comm_create(comm=0xc40003ed, group=0xc8000002, new_comm=0x55934784895c) failed
PMPI_Comm_create(559)...............: 
MPIR_Comm_create_intra(228).........: 
MPIR_Get_contextid_sparse_group(614): Too many communicators (0/2048 free on this process; ignore_id=0)
[cli_1]: aborting job:
Fatal error in PMPI_Comm_create: Other MPI error, error stack:
PMPI_Comm_create(582)...............: MPI_Comm_create(comm=0xc40003ed, group=0xc8000002, new_comm=0x5563412cba1c) failed
PMPI_Comm_create(559)...............: 
MPIR_Comm_create_intra(228).........: 
MPIR_Get_contextid_sparse_group(614): Too many communicators (0/2048 free on this process; ignore_id=0)

Installed packages are:

mvapich2-2.3.1-1.el8.x86_64
mpitests-mvapich2-psm2-5.4.2-4.el8.x86_64
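
The `Too many communicators (0/2048 free on this process ...)` message above says that a per-process pool of 2048 context IDs ran empty. One way such a pool can be exhausted is if communicators are created repeatedly without their IDs being returned. The toy model below only illustrates that pattern; it is not MVAPICH2's implementation, the report does not establish that this is the actual cause, and `ContextIdPool` and `successful_creates` are hypothetical names.

```python
class ContextIdPool:
    """Toy model of a fixed-size pool of communicator context IDs."""

    def __init__(self, size: int = 2048) -> None:
        self.size = size
        self._free = set(range(size))

    def create(self) -> int:
        """Hand out one free ID, or fail when none are left."""
        if not self._free:
            raise RuntimeError(f"Too many communicators (0/{self.size} free)")
        return self._free.pop()

    def free(self, context_id: int) -> None:
        """Return an ID to the pool."""
        self._free.add(context_id)


def successful_creates(iterations: int, release: bool, size: int = 2048) -> int:
    """Count how many create() calls succeed out of `iterations` attempts."""
    pool = ContextIdPool(size)
    done = 0
    for _ in range(iterations):
        try:
            cid = pool.create()
        except RuntimeError:
            break
        if release:
            pool.free(cid)
        done += 1
    return done


print(successful_creates(5000, release=False))  # 2048
print(successful_creates(5000, release=True))   # 5000
```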

Comment 14 Michael Heinz 2019-08-19 12:46:30 UTC
> Installed packages are:
> 
> mvapich2-2.3.1-1.el8.x86_64
> mpitests-mvapich2-psm2-5.4.2-4.el8.x86_64

Sorry, that should say

mvapich2-psm2-2.3.1-1.el8.x86_64

Comment 15 Michael Heinz 2019-08-19 12:50:34 UTC
Configure command generated by rpmbuild --rebuild mvapich2-2.3.1-1.el8.src.rpm 2>&1 | tee make.log

./configure \
--prefix=/usr/lib64/mvapich2-psm2 \
--exec-prefix=/usr/lib64/mvapich2-psm2 \
--bindir=/usr/lib64/mvapich2-psm2/bin \
--sbindir=/usr/lib64/mvapich2-psm2/bin \
--libdir=/usr/lib64/mvapich2-psm2/lib \
--mandir=/usr/share/man/mvapich2-psm2-x86_64 \
--includedir=/usr/include/mvapich2-psm2-x86_64 \
--sysconfdir=/etc/mvapich2-psm2-x86_64 \
--datarootdir=/usr/share/mvapich2-psm2 \
--docdir=/usr/share/doc/mvapich2 \
--enable-error-checking=runtime \
--enable-timing=none \
--enable-g=mem,dbg,meminit \
--enable-fast=all \
--enable-shared \
--enable-static \
--enable-fortran=all \
--enable-cxx \
--disable-silent-rules \
--disable-wrapper-rpath \
--with-hwloc-prefix=system \
--with-device=ch3:psm \
--with-ftb=no \
--with-blcr=no \
--with-fuse=no CC=gcc 'CFLAGS=-m64 -O3 -fno-strict-aliasing -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection ' CXX=g++ 'CXXFLAGS=-m64 -O3 -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection ' FC=gfortran 'FCFLAGS=-m64 -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection ' F77=gfortran 'FFLAGS=-m64 -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection ' 'LDFLAGS=-Wl,-z,relro  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld'
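
When comparing this configuration with another build, note that `CFLAGS` and `CXXFLAGS` above contain both `-O3` and `-O2`. GCC uses the last `-O` option on the command line, so these flags request `-O2`. A small sketch for extracting the effective level from a flags string (`effective_opt_level` is a hypothetical helper; whether the optimization level is related to the failure is not established in this report):

```python
import re
import shlex
from typing import Optional

# Matches -O, -O0..-O3, -Os, -Og, -Oz and -Ofast as whole tokens.
_OPT_FLAG = re.compile(r"-O([0-3sgz]|fast)?")


def effective_opt_level(flags: str) -> Optional[str]:
    """Return the last -O option in a compiler flags string, or None."""
    last = None
    for token in shlex.split(flags):
        if _OPT_FLAG.fullmatch(token):
            last = token
    return last


# Leading part of the CFLAGS value from the configure command above.
cflags = "-m64 -O3 -fno-strict-aliasing -O2 -g -pipe -Wall -Werror=format-security"
print(effective_opt_level(cflags))  # -O2
```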

Comment 16 Michael Heinz 2019-08-19 13:02:34 UTC
I'm double checking that my by-hand builds of 2.3.2 work as expected.

Comment 17 Michael Heinz 2019-08-20 18:09:30 UTC
Okay, I've tried this several different ways, and the result is that building MVAPICH2 2.3.2 using the above configuration produces a working version of MVAPICH2, but compiling 2.3.1 with the same config, or variations of that config, creates a broken MPI that fails.

So, my recommendation is that you use mvapich2 2.3.2 and dump 2.3.1.

Comment 18 Jarod Wilson 2019-08-21 17:17:49 UTC
I hadn't actually noticed that 2.3.2 was just released. The changelog doesn't really mention anything that looks directly related, but if it works, it works, and yeah, I've got no problem rolling forward to 2.3.2 now. I'll get my internal tree updated accordingly.

Comment 20 Afom T. Michael 2019-12-19 21:23:24 UTC
With recent RHEL-8.2 build (4.18.0-167.el8.x86_64), tests passed as shown below.

[root@rdma-qe-15 ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
[root@rdma-qe-15 ~]$ uname -r
4.18.0-167.el8.x86_64
[root@rdma-qe-15 ~]$ rpm -qa | grep -E "rdma|mvapich2|mpitests-mvapich2-psm2"
mvapich2-psm2-2.3.2-2.el8.x86_64
rdma-core-devel-26.0-7.el8.x86_64
rdma-core-26.0-7.el8.x86_64
librdmacm-26.0-7.el8.x86_64
librdmacm-utils-26.0-7.el8.x86_64
mpitests-mvapich2-psm2-5.4.2-4.el8.x86_64
[root@rdma-qe-15 ~]$ ibstatus 
Infiniband device 'hfi1_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:0011:7501:0167:0fb0
	base lid:	 0x5
	sm lid:		 0x4
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 100 Gb/sec (4X EDR)
	link_layer:	 InfiniBand

[root@rdma-qe-15 ~]$ ip address show hfi1_opa0
6: hfi1_opa0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UP group default qlen 256
    link/infiniband 80:00:00:02:fe:80:00:00:00:00:00:00:00:11:75:01:01:67:0f:b0 brd 00:ff:ff:ff:ff:12:40:1b:80:01:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.20.15/24 brd 172.31.20.255 scope global noprefixroute hfi1_opa0
       valid_lft forever preferred_lft forever
    inet6 fe80::211:7501:167:fb0/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@rdma-qe-15 ~]$ ssh rdma-qe-14 "ip address show hfi1_opa0"
6: hfi1_opa0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UP group default qlen 256
    link/infiniband 80:00:00:02:fe:80:00:00:00:00:00:00:00:11:75:01:01:67:10:f0 brd 00:ff:ff:ff:ff:12:40:1b:80:01:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.20.14/24 brd 172.31.20.255 scope global noprefixroute hfi1_opa0
       valid_lft forever preferred_lft forever
    inet6 fe80::211:7501:167:10f0/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@rdma-qe-15 ~]$
[root@rdma-qe-15 ~]$ /usr/lib64/mvapich2-psm2/bin/mpirun -env HFI_UNIT=0 -env IPATH_UNIT=0 -env MV2_IBA_HCA=hfi1_0 -env MV2_DEFAULT_PORT=1 -np 2 -hosts 172.31.20.14,172.31.20.15 -env LD_LIBRARY_PATH=/usr/lib64/mvapich2-psm2/lib/ /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018 Update 1, MPI-2 part    
#------------------------------------------------------------
# Date                  : Thu Dec 19 16:13:30 2019
# Machine               : x86_64
# System                : Linux
# Release               : 4.18.0-167.el8.x86_64
# Version               : #1 SMP Sun Dec 15 01:24:23 UTC 2019
# MPI Version           : 3.1
# MPI Thread Environment: 


# Calling sequence was: 

# /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE 
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM  
#
#

# List of Benchmarks to run:

# Window
# Unidir_Get
# Unidir_Put
# Bidir_Get
# Bidir_Put
# Accumulate

#----------------------------------------------------------------
# Benchmarking Window 
# #processes = 2 
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0          100        97.36        97.36        97.36
            4          100        95.69        95.69        95.69
            8          100        94.82        94.82        94.82
           16          100        95.08        95.08        95.08
           32          100        94.12        94.12        94.12
           64          100        94.38        94.38        94.38
          128          100        94.80        94.80        94.80
          256          100        94.40        94.44        94.42
          512          100        94.05        94.09        94.07
         1024          100        93.74        93.74        93.74
         2048          100        93.75        93.75        93.75
         4096          100        94.13        94.13        94.13
         8192          100        94.34        94.37        94.35
        16384          100        95.04        95.05        95.04
        32768          100        93.93        93.94        93.94
        65536          100        94.25        94.26        94.26
       131072          100        94.61        94.61        94.61
       262144          100        94.71        94.71        94.71
       524288           80        94.05        94.05        94.05
      1048576           40        95.67        95.68        95.67
      2097152           20        98.06        98.06        98.06
      4194304           10        97.89        97.92        97.91

#---------------------------------------------------
# Benchmarking Unidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         0.93         4.30
            8         1000         0.98         8.19
           16         1000         0.97        16.54
           32         1000         0.97        32.94
           64         1000         0.97        66.18
          128         1000         0.97       131.49
          256         1000         1.00       256.51
          512         1000         1.03       499.18
         1024         1000         1.08       946.44
         2048         1000         1.16      1771.12
         4096         1000         1.46      2813.14
         8192         1000         2.57      3183.82
        16384         1000         4.48      3659.18
        32768         1000         5.50      5956.18
        65536          640         5.54     11831.45
       131072          320        10.77     12172.00
       262144          160        21.62     12125.02
       524288           80        43.15     12149.30
      1048576           40        85.77     12226.14
      2097152           20       170.85     12274.76
      4194304           10       340.51     12317.73

#---------------------------------------------------
# Benchmarking Unidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100         9.65         0.00
            4          100        11.86         0.34
            8          100        12.01         0.67
           16          100        11.56         1.38
           32          100        11.61         2.76
           64          100        11.56         5.54
          128          100        11.68        10.96
          256          100        11.63        22.02
          512          100        11.83        43.30
         1024          100        12.09        84.66
         2048          100        12.45       164.53
         4096          100        13.38       306.07
         8192          100        15.04       544.53
        16384          100        19.60       835.90
        32768          100        21.21      1544.95
        65536          100        31.86      2056.85
       131072          100        37.68      3478.37
       262144          100        48.54      5400.88
       524288           80        65.24      8036.26
      1048576           40       108.44      9669.22
      2097152           20       197.70     10607.93
      4194304           10       366.26     11451.76

#---------------------------------------------------
# Benchmarking Unidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         0.93         4.31
            8         1000         0.92         8.74
           16         1000         0.96        16.67
           32         1000         0.96        33.23
           64         1000         0.98        65.14
          128         1000         0.97       131.39
          256         1000         1.00       257.06
          512         1000         1.02       499.88
         1024         1000         1.11       919.10
         2048         1000         1.23      1662.46
         4096         1000         1.51      2711.04
         8192         1000         2.06      3982.35
        16384         1000        10.81      1516.22
        32768         1000        14.68      2231.55
        65536          640         8.11      8085.02
       131072          320        10.65     12303.09
       262144          160        22.16     11828.27
       524288           80        42.53     12327.23
      1048576           40        85.29     12293.63
      2097152           20       170.70     12285.90
      4194304           10       341.20     12292.77

#---------------------------------------------------
# Benchmarking Unidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        10.22         0.00
            4          100        16.16         0.25
            8          100        15.97         0.50
           16          100        15.98         1.00
           32          100        16.00         2.00
           64          100        16.06         3.99
          128          100        15.98         8.01
          256          100        16.01        15.99
          512          100        16.16        31.69
         1024          100        16.48        62.12
         2048          100        16.69       122.71
         4096          100        17.70       231.38
         8192          100        19.39       422.47
        16384          100        25.04       654.41
        32768          100        26.58      1232.86
        65536          100        40.48      1619.12
       131072          100        46.09      2843.61
       262144          100        56.86      4610.50
       524288           80        78.68      6663.71
      1048576           40       115.84      9051.81
      2097152           20       206.27     10167.13
      4194304           10       375.94     11156.89

#---------------------------------------------------
# Benchmarking Bidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.04         0.00
            4         1000         2.54         1.57
            8         1000         2.03         3.93
           16         1000         2.07         7.74
           32         1000         2.10        15.20
           64         1000         2.11        30.39
          128         1000         2.14        59.75
          256         1000         2.21       115.59
          512         1000         2.37       216.44
         1024         1000         2.56       400.31
         2048         1000         2.94       695.60
         4096         1000         3.98      1028.98
         8192         1000         6.47      1265.97
        16384         1000         6.82      2401.18
        32768         1000         7.26      4516.41
        65536          640         8.30      7894.54
       131072          320        13.10     10008.64
       262144          160        25.19     10407.73
       524288           80        48.40     10832.63
      1048576           40        88.54     11842.60
      2097152           20       175.69     11936.62
      4194304           10       343.20     12221.04

#---------------------------------------------------
# Benchmarking Bidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        11.28         0.00
            4          100        18.00         0.22
            8          100        18.99         0.42
           16          100        18.80         0.85
           32          100        18.87         1.70
           64          100        18.70         3.42
          128          100        18.92         6.77
          256          100        18.92        13.53
          512          100        19.01        26.93
         1024          100        19.37        52.85
         2048          100        19.77       103.61
         4096          100        20.49       199.93
         8192          100        18.86       434.38
        16384          100        30.57       536.03
        32768          100        32.55      1006.58
        65536          100        43.85      1494.39
       131072          100        52.09      2516.16
       262144          100        58.29      4497.16
       524288           80        84.22      6224.90
      1048576           40       130.06      8062.41
      2097152           20       215.69      9723.20
      4194304           10       394.68     10627.15

#---------------------------------------------------
# Benchmarking Bidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         1.95         2.05
            8         1000         1.93         4.14
           16         1000         2.02         7.91
           32         1000         2.06        15.50
           64         1000         2.07        30.96
          128         1000         2.12        60.37
          256         1000         2.15       119.29
          512         1000         2.25       227.75
         1024         1000         2.50       409.36
         2048         1000         2.92       700.36
         4096         1000         3.89      1051.85
         8192         1000         6.34      1292.06
        16384         1000        23.26       704.46
        32768         1000        32.51      1007.96
        65536          640        12.68      5168.55
       131072          320        17.61      7442.33
       262144          160        31.25      8389.62
       524288           80        51.61     10157.74
      1048576           40        95.10     11026.13
      2097152           20       180.63     11610.47
      4194304           10       353.43     11867.37

#---------------------------------------------------
# Benchmarking Bidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        11.58         0.00
            4          100        18.42         0.22
            8          100        18.10         0.44
           16          100        18.48         0.87
           32          100        18.36         1.74
           64          100        18.45         3.47
          128          100        18.78         6.82
          256          100        18.83        13.59
          512          100        19.00        26.94
         1024          100        19.18        53.38
         2048          100        19.73       103.81
         4096          100        21.52       190.30
         8192          100        23.53       348.09
        16384          100        40.56       403.95
        32768          100        41.62       787.30
        65536          100        51.94      1261.66
       131072          100        57.50      2279.35
       262144          100        68.71      3815.10
       524288           80        92.43      5672.52
      1048576           40       135.24      7753.28
      2097152           20       223.60      9379.00
      4194304           10       399.85     10489.65

#----------------------------------------------------------------
# Benchmarking Accumulate 
# #processes = 2 
#----------------------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.03         0.03         0.03
            4         1000         0.91         0.95         0.93
            8         1000         0.92         0.96         0.94
           16         1000         0.98         1.02         1.00
           32         1000         0.99         1.03         1.01
           64         1000         0.99         1.03         1.01
          128         1000         1.00         1.04         1.02
          256         1000         1.01         1.06         1.04
          512         1000         1.08         1.13         1.10
         1024         1000         1.25         1.25         1.25
         2048         1000         1.62         1.63         1.63
         4096         1000         2.43         2.44         2.44
         8192         1000         3.96         3.97         3.96
        16384         1000        18.03        18.04        18.04
        32768         1000        28.18        28.19        28.19
        65536          640        28.63        28.65        28.64
       131072          320        50.29        50.31        50.30
       262144          160        90.67        90.71        90.69
       524288           80       182.80       182.88       182.84
      1048576           40       364.79       364.89       364.84
      2097152           20       733.16       733.39       733.27
      4194304           10      1548.22      1567.03      1557.62

#----------------------------------------------------------------
# Benchmarking Accumulate 
# #processes = 2 
#----------------------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0          100        11.60        11.75        11.68
            4          100        22.80        22.90        22.85
            8          100        21.88        21.98        21.93
           16          100        17.89        17.91        17.90
           32          100        17.86        17.87        17.86
           64          100        17.83        17.85        17.84
          128          100        18.07        18.10        18.08
          256          100        17.93        17.96        17.94
          512          100        18.22        18.25        18.24
         1024          100        18.51        18.53        18.52
         2048          100        18.96        18.99        18.97
         4096          100        21.35        21.37        21.36
         8192          100        24.33        24.35        24.34
        16384          100        31.88        31.90        31.89
        32768          100        35.42        35.44        35.43
        65536          100        54.98        55.01        54.99
       131072          100        78.90        78.93        78.92
       262144          100       124.07       124.10       124.09
       524288           80       206.81       206.84       206.82
      1048576           40       343.62       343.68       343.65
      2097152           20       674.97       675.44       675.21
      4194304           10      1569.87      1577.50      1573.68


# All processes entering MPI_Finalize

[0] 24 at [0x000055af6d8f3f80], src/mpi/group/grouputil.c[74]
[0] 24 at [0x000055af6e345fd0], src/mpi/group/grouputil.c[74]
[0] 24 at [0x000055af6df240a0], src/mpi/group/grouputil.c[74]
[0] 24 at [0x000055af6d8e0590], src/mpi/group/grouputil.c[74]
[0] 24 at [0x000055af6d8f3ec0], src/mpi/group/grouputil.c[74]
[0] 24 at [0x000055af6d8c5e90], src/mpi/group/grouputil.c[74]
[0] 24 at [0x000055af6d8c5ab0], src/mpi/group/grouputil.c[74]
[root@rdma-qe-15 ~]$ echo $?
0
[root@rdma-qe-15 ~]$ /usr/lib64/mvapich2-psm2/bin/mpirun -env HFI_UNIT=0 -env IPATH_UNIT=0 -env MV2_IBA_HCA=hfi1_0 -env MV2_DEFAULT_PORT=1 -np 2 -hosts rdma-qe-14,rdma-qe-15 -env LD_LIBRARY_PATH=/usr/lib64/mvapich2-psm2/lib/ /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018 Update 1, MPI-2 part    
#------------------------------------------------------------
# Date                  : Thu Dec 19 16:02:20 2019
# Machine               : x86_64
# System                : Linux
# Release               : 4.18.0-167.el8.x86_64
# Version               : #1 SMP Sun Dec 15 01:24:23 UTC 2019
# MPI Version           : 3.1
# MPI Thread Environment: 


# Calling sequence was: 

# /usr/lib64/mvapich2-psm2/bin/mpitests-IMB-EXT

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE 
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM  
#
#

# List of Benchmarks to run:

# Window
# Unidir_Get
# Unidir_Put
# Bidir_Get
# Bidir_Put
# Accumulate

#----------------------------------------------------------------
# Benchmarking Window 
# #processes = 2 
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0          100        94.68        94.68        94.68
            4          100        93.96        93.96        93.96
            8          100        92.64        92.64        92.64
           16          100        92.18        92.18        92.18
           32          100        91.74        91.74        91.74
           64          100        91.62        91.63        91.62
          128          100        91.85        91.85        91.85
          256          100        91.81        91.82        91.81
          512          100        91.72        91.73        91.73
         1024          100        91.93        91.93        91.93
         2048          100        91.60        91.60        91.60
         4096          100        92.49        92.52        92.51
         8192          100        91.85        91.86        91.85
        16384          100        91.60        91.61        91.61
        32768          100        91.14        91.15        91.14
        65536          100        91.53        91.54        91.53
       131072          100        91.75        91.78        91.76
       262144          100        91.75        91.75        91.75
       524288           80        91.53        91.53        91.53
      1048576           40        91.96        91.99        91.98
      2097152           20        95.86        95.88        95.87
      4194304           10        94.41        94.84        94.63

#---------------------------------------------------
# Benchmarking Unidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         0.93         4.29
            8         1000         0.94         8.48
           16         1000         0.97        16.53
           32         1000         0.97        32.96
           64         1000         0.97        65.87
          128         1000         0.97       131.78
          256         1000         0.98       260.24
          512         1000         1.04       492.54
         1024         1000         1.09       940.44
         2048         1000         1.16      1767.11
         4096         1000         1.42      2893.70
         8192         1000         2.59      3156.90
        16384         1000         4.48      3659.38
        32768         1000         5.53      5927.41
        65536          640         5.54     11823.50
       131072          320        10.77     12174.52
       262144          160        21.55     12163.58
       524288           80        43.08     12171.15
      1048576           40        85.76     12226.99
      2097152           20       170.54     12297.07
      4194304           10       341.06     12297.93

#---------------------------------------------------
# Benchmarking Unidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100         9.19         0.00
            4          100        11.64         0.34
            8          100        11.73         0.68
           16          100        11.36         1.41
           32          100        11.38         2.81
           64          100        11.51         5.56
          128          100        11.51        11.12
          256          100        11.46        22.35
          512          100        11.67        43.88
         1024          100        12.13        84.41
         2048          100        12.27       166.89
         4096          100        13.25       309.05
         8192          100        14.94       548.18
        16384          100        19.60       835.90
        32768          100        21.06      1555.80
        65536          100        31.79      2061.48
       131072          100        37.43      3502.08
       262144          100        47.84      5479.75
       524288           80        65.14      8048.77
      1048576           40       108.14      9696.40
      2097152           20       198.26     10577.95
      4194304           10       367.45     11414.60

#---------------------------------------------------
# Benchmarking Unidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         0.90         4.42
            8         1000         0.93         8.58
           16         1000         0.93        17.13
           32         1000         0.95        33.51
           64         1000         0.97        66.04
          128         1000         0.98       130.53
          256         1000         1.00       257.25
          512         1000         1.03       495.38
         1024         1000         1.09       935.93
         2048         1000         1.23      1667.30
         4096         1000         1.51      2715.76
         8192         1000         2.05      3996.71
        16384         1000        11.67      1404.53
        32768         1000        15.15      2162.72
        65536          640         8.10      8092.45
       131072          320        10.65     12307.39
       262144          160        21.76     12048.62
       524288           80        42.67     12286.76
      1048576           40        85.03     12332.41
      2097152           20       170.70     12285.90
      4194304           10       340.08     12333.28

#---------------------------------------------------
# Benchmarking Unidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100         9.63         0.00
            4          100        15.18         0.26
            8          100        15.01         0.53
           16          100        14.90         1.07
           32          100        15.13         2.12
           64          100        15.05         4.25
          128          100        15.13         8.46
          256          100        15.06        17.00
          512          100        15.15        33.80
         1024          100        15.65        65.43
         2048          100        16.11       127.15
         4096          100        17.03       240.55
         8192          100        19.09       429.01
        16384          100        28.51       574.72
        32768          100        25.82      1269.06
        65536          100        39.72      1650.03
       131072          100        45.16      2902.46
       262144          100        55.43      4728.88
       524288           80        80.58      6506.47
      1048576           40       115.80      9055.07
      2097152           20       205.22     10219.10
      4194304           10       374.60     11196.66

#---------------------------------------------------
# Benchmarking Bidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         2.11         1.90
            8         1000         2.01         3.97
           16         1000         2.04         7.85
           32         1000         2.09        15.33
           64         1000         2.06        31.03
          128         1000         2.09        61.13
          256         1000         2.16       118.78
          512         1000         2.31       221.50
         1024         1000         2.50       409.71
         2048         1000         2.82       725.69
         4096         1000         3.91      1046.47
         8192         1000         6.56      1248.04
        16384         1000         7.02      2333.82
        32768         1000         7.33      4470.43
        65536          640         8.36      7836.86
       131072          320        13.08     10018.33
       262144          160        25.13     10429.94
       524288           80        47.26     11093.57
      1048576           40        91.02     11520.75
      2097152           20       174.09     12046.14
      4194304           10       349.57     11998.49

#---------------------------------------------------
# Benchmarking Bidir_Get 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        11.43         0.00
            4          100        18.62         0.21
            8          100        18.72         0.43
           16          100        18.87         0.85
           32          100        18.89         1.69
           64          100        18.71         3.42
          128          100        18.92         6.77
          256          100        19.04        13.45
          512          100        19.12        26.78
         1024          100        19.59        52.26
         2048          100        20.01       102.33
         4096          100        20.90       195.94
         8192          100        20.00       409.63
        16384          100        30.61       535.28
        32768          100        32.75      1000.50
        65536          100        43.29      1513.98
       131072          100        51.69      2535.89
       262144          100        58.96      4446.06
       524288           80        85.92      6101.83
      1048576           40       130.96      8007.00
      2097152           20       213.02      9845.09
      4194304           10       397.68     10546.87

#---------------------------------------------------
# Benchmarking Bidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.03         0.00
            4         1000         1.92         2.08
            8         1000         1.92         4.16
           16         1000         2.00         7.99
           32         1000         2.03        15.76
           64         1000         2.03        31.52
          128         1000         2.06        62.19
          256         1000         2.10       122.14
          512         1000         2.21       231.48
         1024         1000         2.40       427.27
         2048         1000         2.76       741.09
         4096         1000         3.81      1075.02
         8192         1000         6.34      1292.55
        16384         1000        23.47       698.21
        32768         1000        32.58      1005.76
        65536          640        12.68      5169.61
       131072          320        17.39      7537.35
       262144          160        31.98      8198.05
       524288           80        53.18      9858.33
      1048576           40        95.01     11036.50
      2097152           20       180.38     11626.59
      4194304           10       351.72     11925.29

#---------------------------------------------------
# Benchmarking Bidir_Put 
# #processes = 2 
#---------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100        13.05         0.00
            4          100        19.08         0.21
            8          100        18.70         0.43
           16          100        18.59         0.86
           32          100        18.66         1.72
           64          100        19.45         3.29
          128          100        19.15         6.68
          256          100        18.86        13.58
          512          100        19.55        26.20
         1024          100        19.94        51.36
         2048          100        19.84       103.22
         4096          100        22.06       185.65
         8192          100        24.07       340.30
        16384          100        40.60       403.57
        32768          100        43.03       761.56
        65536          100        53.03      1235.91
       131072          100        57.55      2277.55
       262144          100        70.38      3724.76
       524288           80        91.20      5748.52
      1048576           40       138.03      7596.92
      2097152           20       228.63      9172.63
      4194304           10       401.95     10434.89

#----------------------------------------------------------------
# Benchmarking Accumulate 
# #processes = 2 
#----------------------------------------------------------------
#
#    MODE: AGGREGATE 
#
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.03         0.03         0.03
            4         1000         0.91         0.95         0.93
            8         1000         0.91         0.95         0.93
           16         1000         0.97         1.01         0.99
           32         1000         0.98         1.02         1.00
           64         1000         0.98         1.02         1.00
          128         1000         1.00         1.05         1.02
          256         1000         1.01         1.05         1.03
          512         1000         1.08         1.13         1.11
         1024         1000         1.27         1.29         1.28
         2048         1000         1.64         1.65         1.64
         4096         1000         2.44         2.45         2.45
         8192         1000         3.85         3.86         3.86
        16384         1000        18.29        18.29        18.29
        32768         1000        28.28        28.28        28.28
        65536          640        29.22        29.23        29.23
       131072          320        51.52        51.56        51.54
       262144          160        91.34        91.40        91.37
       524288           80       184.28       184.37       184.32
      1048576           40       369.47       369.64       369.55
      2097152           20       741.26       741.59       741.42
      4194304           10      1544.48      1557.92      1551.20

#----------------------------------------------------------------
# Benchmarking Accumulate 
# #processes = 2 
#----------------------------------------------------------------
#
#    MODE: NON-AGGREGATE 
#
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0          100        12.86        12.87        12.87
            4          100        22.88        22.96        22.92
            8          100        21.99        22.02        22.00
           16          100        17.61        17.64        17.62
           32          100        17.97        17.99        17.98
           64          100        17.80        17.83        17.81
          128          100        18.09        18.20        18.14
          256          100        17.97        18.00        17.99
          512          100        18.05        18.07        18.06
         1024          100        18.51        18.54        18.52
         2048          100        18.99        19.02        19.00
         4096          100        21.51        21.54        21.53
         8192          100        23.89        23.91        23.90
        16384          100        32.03        32.06        32.04
        32768          100        36.14        36.16        36.15
        65536          100        55.69        55.72        55.71
       131072          100        79.86        79.89        79.88
       262144          100       124.43       124.46       124.44
       524288           80       201.18       201.21       201.19
      1048576           40       345.16       345.58       345.37
      2097152           20       671.46       671.95       671.70
      4194304           10      1563.12      1564.41      1563.76


# All processes entering MPI_Finalize

[0] 24 at [0x000055872365bfd0], src/mpi/group/grouputil.c[74]
[0] 24 at [0x0000558722f16140], src/mpi/group/grouputil.c[74]
[0] 24 at [0x0000558722f1d650], src/mpi/group/grouputil.c[74]
[0] 24 at [0x0000558722f1d590], src/mpi/group/grouputil.c[74]
[0] 24 at [0x0000558722f30ec0], src/mpi/group/grouputil.c[74]
[0] 24 at [0x0000558722f02e90], src/mpi/group/grouputil.c[74]
[0] 24 at [0x0000558722f02ab0], src/mpi/group/grouputil.c[74]
[root@rdma-qe-15 ~]$ echo $?
0
[root@rdma-qe-15 ~]$ 

=====================================================================
Test results for mvapich2-psm2 on rdma-qe-15:
4.18.0-167.el8.x86_64, hfi1, opa0, & hfi1_0
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | mvapich2-psm2 IMB-MPI1 PingPong mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 PingPing mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Sendrecv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Exchange mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Bcast mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Allgather mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Allgatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Gather mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Gatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Scatter mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Scatterv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Alltoall mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Alltoallv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Reduce mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Reduce_scatter mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Allreduce mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Barrier mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Write_indv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Read_indv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Write_expl mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Read_expl mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_indv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_indv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_expl mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_expl mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_shared mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_shared mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_priv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_priv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Write_indv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Read_indv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Write_expl mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Read_expl mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Write_shared mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Read_shared mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Window mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Unidir_Put mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Unidir_Get mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Bidir_Get mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Bidir_Put mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Accumulate mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ibcast mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iallgather mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iallgatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Igather mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Igatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iscatter mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iscatterv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ialltoall mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ialltoallv mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ireduce mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ireduce_scatter mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iallreduce mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ibarrier mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Unidir_put mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Unidir_get mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Bidir_put mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Bidir_get mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA One_put_all mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA One_get_all mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA All_put_all mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA All_get_all mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Put_local mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Put_all_local mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Exchange_put mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Exchange_get mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Accumulate mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Get_accumulate mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Fetch_and_op mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Compare_and_swap mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Get_local mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Get_all_local mpirun one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 PingPong mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 PingPing mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Sendrecv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Exchange mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Bcast mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Allgather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Allgatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Gather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Gatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Scatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Scatterv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Alltoall mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Alltoallv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Reduce mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Reduce_scatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Allreduce mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Barrier mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Write_indv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Read_indv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Write_expl mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO S_Read_expl mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_indv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_indv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_expl mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_expl mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_shared mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_shared mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Write_priv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO P_Read_priv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Write_indv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Read_indv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Write_expl mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Read_expl mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Write_shared mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-IO C_Read_shared mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Window mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Unidir_Put mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Unidir_Get mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Bidir_Get mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Bidir_Put mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-EXT Accumulate mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ibcast mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iallgather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iallgatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Igather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Igatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iscatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iscatterv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ialltoall mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ialltoallv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ireduce mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ireduce_scatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Iallreduce mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-NBC Ibarrier mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Unidir_put mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Unidir_get mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Bidir_put mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Bidir_get mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA One_put_all mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA One_get_all mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA All_put_all mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA All_get_all mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Put_local mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Put_all_local mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Exchange_put mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Exchange_get mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Accumulate mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Get_accumulate mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Fetch_and_op mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Compare_and_swap mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Get_local mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 IMB-RMA Get_all_local mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU acc_latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU allgather mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU allgatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU allreduce mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU alltoall mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU alltoallv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU barrier mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU bcast mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU bibw mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU bw mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU cas_latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU fop_latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU gather mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU gatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU get_acc_latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU get_bw mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU get_latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU hello mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU iallgather mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU iallgatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU ialltoall mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU ialltoallv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU ialltoallw mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU ibarrier mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU ibcast mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU igather mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU igatherv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU init mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU iscatter mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU iscatterv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU mbw_mr mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU multi_lat mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU put_bibw mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU put_bw mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU put_latency mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU reduce mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU reduce_scatter mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU scatter mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU scatterv mpirun one_core
      PASS |      0 | mvapich2-psm2 OSU acc_latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU allgather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU allgatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU allreduce mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU alltoall mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU alltoallv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU barrier mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU bcast mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU bibw mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU bw mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU cas_latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU fop_latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU gather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU gatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU get_acc_latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU get_bw mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU get_latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU hello mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU iallgather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU iallgatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU ialltoall mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU ialltoallv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU ialltoallw mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU ibarrier mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU ibcast mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU igather mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU igatherv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU init mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU iscatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU iscatterv mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU mbw_mr mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU multi_lat mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU put_bibw mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU put_bw mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU put_latency mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU reduce mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU reduce_scatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU scatter mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU scatterv mpirun_rsh one_core

Checking for failures and known issues:
  no test failures
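
The "no test failures" check above can be reproduced from the raw log. What follows is only a minimal sketch, not the QA harness that produced these results. It assumes every result line has the shape "STATUS | exit code | test description", as in the table above. The names parse_results and find_failures are hypothetical, and the FAIL row in the sample is invented purely to exercise the failure path.

```python
import re

# Assumed line shape, taken from the log above:
#   "<STATUS> | <exit code> | <test description>"
RESULT_LINE = re.compile(
    r"^\s*(?P<status>[A-Z]+)\s*\|\s*(?P<code>-?\d+)\s*\|\s*(?P<name>.+?)\s*$"
)


def parse_results(log_text):
    """Return (status, exit_code, name) tuples for every result line found."""
    results = []
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results.append(
                (match["status"], int(match["code"]), match["name"])
            )
    return results


def find_failures(results):
    """Treat a test as failed if its status is not PASS or its exit code is nonzero."""
    return [r for r in results if r[0] != "PASS" or r[1] != 0]


# Two rows copied from the log, plus one invented FAIL row (hypothetical).
sample = """\
      PASS |      0 | mvapich2-psm2 IMB-MPI1 Barrier mpirun_rsh one_core
      PASS |      0 | mvapich2-psm2 OSU latency mpirun one_core
      FAIL |      1 | mvapich2-psm2 OSU hypothetical_case mpirun one_core
"""

results = parse_results(sample)
failures = find_failures(results)

if failures:
    for status, code, name in failures:
        print(f"{status} (exit {code}): {name}")
else:
    print("no test failures")
```

Run against the full log in this comment, a check of this kind would be expected to find nothing to report, since every row shows PASS with exit code 0.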

Comment 21 Afom T. Michael 2019-12-23 20:40:09 UTC
Moving to VERIFIED since our tests passed, as shown in comment #20.

Comment 23 errata-xmlrpc 2020-04-28 16:56:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1865

