Bug 1808600 - openmpi: Segmentation fault: invalid permissions for mapped
Summary: openmpi: Segmentation fault: invalid permissions for mapped
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: openmpi
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Honggang LI
QA Contact: Infiniband QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-28 22:33 UTC by Christoph Junghans
Modified: 2020-11-14 04:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-03 02:27:54 UTC
Type: Bug
Target Upstream Version:
junghans: needinfo+


Description Christoph Junghans 2020-02-28 22:33:34 UTC
From building gromacs-2019.6 on epel8:

26/27 Test #26: SimdUnitTests ....................***Exception: SegFault  0.58 sec
[buildvm-03:3011859:0:3011859] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x7fd431277768)
==== backtrace ====
    0  /lib64/libucs.so.0(+0x18bb0) [0x7fd430c0abb0]
    1  /lib64/libucs.so.0(+0x18d8a) [0x7fd430c0ad8a]
    2  /lib64/libuct.so.0(+0x1655b) [0x7fd431b0055b]
    3  /lib64/ld-linux-x86-64.so.2(+0xfd0a) [0x7fd44329dd0a]
    4  /lib64/ld-linux-x86-64.so.2(+0xfe0a) [0x7fd44329de0a]
    5  /lib64/ld-linux-x86-64.so.2(+0x13def) [0x7fd4432a1def]
    6  /lib64/libc.so.6(_dl_catch_exception+0x77) [0x7fd43f1e2ab7]
    7  /lib64/ld-linux-x86-64.so.2(+0x1365e) [0x7fd4432a165e]
    8  /lib64/libdl.so.2(+0x11ba) [0x7fd442b401ba]
    9  /lib64/libc.so.6(_dl_catch_exception+0x77) [0x7fd43f1e2ab7]
   10  /lib64/libc.so.6(_dl_catch_error+0x33) [0x7fd43f1e2b53]
   11  /lib64/libdl.so.2(+0x1939) [0x7fd442b40939]
   12  /lib64/libdl.so.2(dlopen+0x4a) [0x7fd442b4025a]
   13  /usr/lib64/openmpi/lib/libopen-pal.so.40(+0x6df05) [0x7fd43ebacf05]
   14  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_repository_open+0x206) [0x7fd43eb8ab16]
   15  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_find+0x35a) [0x7fd43eb89a5a]
   16  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_components_register+0x2e) [0x7fd43eb953ce]
   17  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_register+0x252) [0x7fd43eb958b2]
   18  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_open+0x15) [0x7fd43eb95915]
   19  /usr/lib64/openmpi/lib/libmpi.so.40(ompi_mpi_init+0x674) [0x7fd442d94494]
   20  /usr/lib64/openmpi/lib/libmpi.so.40(PMPI_Init_thread+0x55) [0x7fd442dc4805]
   21  /builddir/build/BUILD/gromacs-2019.6/openmpi/bin/simd-test(+0x112d35) [0x5639771a3d35]
   22  /builddir/build/BUILD/gromacs-2019.6/openmpi/bin/simd-test(+0xf01cd) [0x5639771811cd]
   23  /builddir/build/BUILD/gromacs-2019.6/openmpi/bin/simd-test(+0xcfe6a) [0x563977160e6a]
   24  /builddir/build/BUILD/gromacs-2019.6/openmpi/bin/simd-test(+0x62fd0) [0x5639770f3fd0]
   25  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7fd43f0cd873]
   26  /builddir/build/BUILD/gromacs-2019.6/openmpi/bin/simd-test(+0x6390e) [0x5639770f490e]
===================
(see https://koji.fedoraproject.org/koji/taskinfo?taskID=41996718)

We had this on rawhide a while back as well, so we need to update openmpi.
This might be bug#1744780.
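
For triage, a trace like the one above can be reduced to the ordered list of shared objects it passes through. Here the topmost frames are in libucs and libuct, reached through dlopen from mca_base_component_repository_open. The following is a minimal Python sketch and not part of the original report: the helper names `parse_frames` and `libraries_in_order` are hypothetical, and the regular expression assumes the frame layout shown in this backtrace.

```python
import re

# Assumed frame layout: "<index>  <path>(<symbol+offset>) [<address>]"
FRAME_RE = re.compile(r"^\s*(\d+)\s+(\S+?)\((.*?)\)\s+\[(0x[0-9a-f]+)\]\s*$")

# A few frames copied from the backtrace in this report.
SAMPLE = """
    0  /lib64/libucs.so.0(+0x18bb0) [0x7fd430c0abb0]
    2  /lib64/libuct.so.0(+0x1655b) [0x7fd431b0055b]
   12  /lib64/libdl.so.2(dlopen+0x4a) [0x7fd442b4025a]
   14  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_repository_open+0x206) [0x7fd43eb8ab16]
"""


def parse_frames(text):
    """Return (index, path, symbol, address) tuples for each frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            index, path, symbol, addr = match.groups()
            frames.append((int(index), path, symbol, addr))
    return frames


def libraries_in_order(text):
    """Return each shared object once, in the order it first appears."""
    seen = []
    for _index, path, _symbol, _addr in parse_frames(text):
        if path not in seen:
            seen.append(path)
    return seen


if __name__ == "__main__":
    for path in libraries_in_order(SAMPLE):
        print(path)
```

Lines that do not match the assumed layout (such as the "==== backtrace ====" delimiter) are skipped.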

Comment 1 Honggang LI 2020-03-02 02:20:48 UTC
https://access.redhat.com/solutions/3592

Please provide the sosreport. Thanks

Comment 2 Honggang LI 2020-03-02 02:37:47 UTC
(In reply to Christoph Junghans from comment #0)

> (see https://koji.fedoraproject.org/koji/taskinfo?taskID=41996718)


https://kojipkgs.fedoraproject.org//work/tasks/6793/41996793/root.log

DEBUG util.py:598:   ucx                      x86_64   1.4.0-3.el8                    build   416 k
DEBUG util.py:598:   openmpi                  x86_64   4.0.1-3.el8                    build   2.8 M

It seems you are using an old release/build of ucx and openmpi. Could you please try the latest
builds, ucx-1.6.1-1.el8 and openmpi-4.0.2-2.el8?

> We had this on rawhide a while back as well, so we need to update openmpi.

We likely need to update ucx, not openmpi.
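
The version check described in this comment (reading the package lines from the mock root.log and comparing them with the requested builds) could be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `installed_versions`, `upstream_tuple`, and `older_than` are hypothetical, the regular expression assumes the "DEBUG util.py" line layout quoted above, and the comparison looks only at the dotted upstream version. It is a simplification and not RPM's real version comparison (epochs, releases, and non-numeric parts are not handled).

```python
import re

# Assumed layout: "DEBUG util.py:<n>:   <name>  <arch>  <version-release>  <repo>  <size>"
LINE_RE = re.compile(
    r"^DEBUG util\.py:\d+:\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+(?:\.\d+)?\s+[kMG])\s*$"
)

# The two lines quoted from root.log in this comment.
SAMPLE_LOG = """\
DEBUG util.py:598:   ucx                      x86_64   1.4.0-3.el8                    build   416 k
DEBUG util.py:598:   openmpi                  x86_64   4.0.1-3.el8                    build   2.8 M
"""

# Builds suggested in this comment.
WANTED = {"ucx": "1.6.1-1.el8", "openmpi": "4.0.2-2.el8"}


def installed_versions(log_text):
    """Map package name to its version-release string from root.log lines."""
    found = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, _arch, version_release, _repo, _size = match.groups()
            found[name] = version_release
    return found


def upstream_tuple(version_release):
    """Turn '1.4.0-3.el8' into (1, 4, 0); assumes a purely numeric version."""
    version = version_release.split("-", 1)[0]
    return tuple(int(part) for part in version.split("."))


def older_than(installed, wanted):
    """Simplified check: compares upstream versions only, ignoring the release."""
    return upstream_tuple(installed) < upstream_tuple(wanted)


if __name__ == "__main__":
    for name, have in installed_versions(SAMPLE_LOG).items():
        if name in WANTED and older_than(have, WANTED[name]):
            print(f"{name}: {have} is older than {WANTED[name]}")
```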

Comment 3 Christoph Junghans 2020-03-02 13:57:43 UTC
(In reply to Honggang LI from comment #2)
> (In reply to Christoph Junghans from comment #0)
> 
> > (see https://koji.fedoraproject.org/koji/taskinfo?taskID=41996718)
> 
> 
> https://kojipkgs.fedoraproject.org//work/tasks/6793/41996793/root.log
> 
> DEBUG util.py:598:   ucx                      x86_64   1.4.0-3.el8          
> build   416 k
> DEBUG util.py:598:   openmpi                  x86_64   4.0.1-3.el8          
> build   2.8 M
> 
> It seems you are using the old release/build of ucx and openmpi. Could you
> please try latest
> build for ucx-1.6.1-1.el8 and openmpi-4.0.2-2.el8 ?

How do I get these in epel8/ mock?

Comment 4 Don Dutile (Red Hat) 2020-03-02 23:04:07 UTC
(In reply to Christoph Junghans from comment #3)
> (In reply to Honggang LI from comment #2)
> > (In reply to Christoph Junghans from comment #0)
> > 
> > > (see https://koji.fedoraproject.org/koji/taskinfo?taskID=41996718)
> > 
> > 
> > https://kojipkgs.fedoraproject.org//work/tasks/6793/41996793/root.log
> > 
> > DEBUG util.py:598:   ucx                      x86_64   1.4.0-3.el8          
> > build   416 k
> > DEBUG util.py:598:   openmpi                  x86_64   4.0.1-3.el8          
> > build   2.8 M
> > 
> > It seems you are using the old release/build of ucx and openmpi. Could you
> > please try latest
> > build for ucx-1.6.1-1.el8 and openmpi-4.0.2-2.el8 ?
> 
> How do I get these in epel8/ mock?

ucx & openmpi are part of the RHEL release and are thus not available in EPEL.

Comment 5 Christoph Junghans 2020-03-03 00:01:20 UTC
As this is an EPEL8 not an RHEL8 issue, this can be closed.

Comment 6 Honggang LI 2020-03-03 02:24:13 UTC
(In reply to Christoph Junghans from comment #3)

> > DEBUG util.py:598:   ucx                      x86_64   1.4.0-3.el8          
> > build   416 k
> > DEBUG util.py:598:   openmpi                  x86_64   4.0.1-3.el8          
> > build   2.8 M

ucx-1.4.0-3.el8 and openmpi-4.0.1-3.el8 are available for epel8 because they were
released in the RHEL-8.1 distro. That means they are also available for CentOS-8, which
is why they are used when you build a package with the mock epel8 configuration.

> > It seems you are using the old release/build of ucx and openmpi. Could you
> > please try latest
> > build for ucx-1.6.1-1.el8 and openmpi-4.0.2-2.el8 ?
> 
> How do I get these in epel8/ mock?

ucx-1.6.1-1.el8 and openmpi-4.0.2-2.el8 have not been publicly released yet. They were
built for RHEL-8.2. When RHEL-8.2 is released, they will be available for
CentOS-8.2 too. So, please wait.

Comment 7 Honggang LI 2020-03-03 02:27:54 UTC
(In reply to Christoph Junghans from comment #5)
> As this is an EPEL8 not an RHEL8 issue, this can be closed.

Closing this bug. Please reopen it or file a new bug if the issue persists with
ucx-1.6.1-1.el8 and openmpi-4.0.2-2.el8 once CentOS-8.2 is released.
