Bug 1799473 - gromacs: FTBFS in Fedora rawhide/f32
Summary: gromacs: FTBFS in Fedora rawhide/f32
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gromacs
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Christoph Junghans
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F32FTBFS
Reported: 2020-02-06 17:16 UTC by Fedora Release Engineering
Modified: 2020-02-25 02:43 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-23 23:34:30 UTC
Type: ---
Embargoed:
junghans: needinfo+


Attachments
build.log (32.00 KB, text/plain)
2020-02-06 17:16 UTC, Fedora Release Engineering
root.log (32.00 KB, text/plain)
2020-02-06 17:16 UTC, Fedora Release Engineering
state.log (990 bytes, text/plain)
2020-02-06 17:17 UTC, Fedora Release Engineering


Links
System | ID | Private | Priority | Status | Summary | Last Updated
GNU Compiler Collection | 93750 | 0 | P3 | RESOLVED | Altivec and std=c++11 | 2020-03-08 11:18:29 UTC
Github | pmodels/mpich/issues/4318 | 0 | None | open | INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373 | 2020-03-08 11:18:28 UTC

Internal Links: 1803964

Description Fedora Release Engineering 2020-02-06 17:16:54 UTC
gromacs failed to build from source in Fedora rawhide/f32

https://koji.fedoraproject.org/koji/taskinfo?taskID=41318028


For details on the mass rebuild see:

https://fedoraproject.org/wiki/Fedora_32_Mass_Rebuild
Please fix gromacs at your earliest convenience and set the bug's status to
ASSIGNED when you start fixing it. If the bug remains in the NEW state for 8 weeks,
gromacs will be orphaned. If it still fails to build, gromacs will be retired
before Fedora 33 is branched.

For more details on the FTBFS policy, please visit:
https://fedoraproject.org/wiki/Fails_to_build_from_source

Comment 1 Fedora Release Engineering 2020-02-06 17:16:57 UTC
Created attachment 1659221
build.log

file build.log too big, will only attach last 32768 bytes

Comment 2 Fedora Release Engineering 2020-02-06 17:16:59 UTC
Created attachment 1659222
root.log

file root.log too big, will only attach last 32768 bytes

Comment 3 Fedora Release Engineering 2020-02-06 17:17:01 UTC
Created attachment 1659223
state.log

Comment 4 Christoph Junghans 2020-02-06 17:43:04 UTC
The ppc64le error:
-- Could not find any flag to build test source (this could be due to either the compiler or binutils)
CMake Error at cmake/gmxManageSimd.cmake:51 (message):
  Cannot find IBM VSX compiler flag.  Use a newer compiler, or disable SIMD
  support (slower).
Call Stack (most recent call first):
  cmake/gmxManageSimd.cmake:265 (gmx_give_fatal_error_when_simd_support_not_found)
  CMakeLists.txt:719 (gmx_manage_simd)
-- Configuring incomplete, errors occurred!

Something is wrong with the SIMD flag detection.
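
The CMake message above comes from a probe that tries candidate compiler flags until a small test source builds. A minimal sketch of that kind of probe follows. It is not the logic in cmake/gmxManageSimd.cmake; the function names, the default compiler, and any flag list passed in are illustrative assumptions.

```python
import os
import subprocess
import tempfile


def compiles_with(flags, source, compiler="g++"):
    """Return True if `compiler` can build `source` with the given flags.

    The compiler name is an assumption; pass whatever the build uses.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.cpp")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as handle:
            handle.write(source)
        try:
            result = subprocess.run(
                [compiler, *flags, "-c", src, "-o", obj],
                capture_output=True,
            )
        except OSError:
            # Compiler binary missing or not executable.
            return False
        return result.returncode == 0


def first_working_flag(candidates, source, try_compile=compiles_with):
    """Return the first flag list that builds `source`, or None.

    A None result corresponds to the "Could not find any flag to build
    test source" situation in the log above.
    """
    for flags in candidates:
        if try_compile(flags, source):
            return flags
    return None
```

Running this by hand in the ppc64le mock chroot with the flags and test source that the GROMACS CMake check actually uses would show whether any candidate still compiles with the rawhide compiler.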

On aarch64 the error is:
Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
Abnormal return value for ' gmx mdrun    -nb cpu   -notunepme >mdrun.out 2>&1' was 1
Retrying mdrun with better settings...
.....
98% tests passed, 1 tests failed out of 46
Label Time Summary:
GTest              =  33.68 sec*proc (40 tests)
IntegrationTest    =   6.53 sec*proc (5 tests)
MpiTest            =   2.22 sec*proc (3 tests)
SlowTest           =  20.76 sec*proc (1 test)
UnitTest           =   6.39 sec*proc (34 tests)
Total Test time (real) = 2697.26 sec
The following tests FAILED:
         43 - regressiontests/kernel (Timeout)
Errors while running CTest
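
For comparing architectures it can help to pull the failed tests out of the CTest summary automatically. A minimal sketch, assuming the summary keeps the layout shown above (a "The following tests FAILED:" header followed by "number - name (reason)" lines); the function name `failed_tests` is hypothetical.

```python
import re

# Matches lines such as "         43 - regressiontests/kernel (Timeout)".
FAILED_LINE = re.compile(r"^\s*(\d+)\s+-\s+(\S+)\s+\((.+)\)\s*$")


def failed_tests(log_text):
    """Return a list of (number, name, reason) tuples from a CTest summary."""
    results = []
    in_section = False
    for line in log_text.splitlines():
        if line.startswith("The following tests FAILED:"):
            in_section = True
            continue
        if in_section:
            match = FAILED_LINE.match(line)
            if match is None:
                # First non-matching line ends the list of failures.
                break
            number, name, reason = match.groups()
            results.append((int(number), name, reason))
    return results
```

Calling `failed_tests(open("build.log").read())` on a full build log would list each failing test with its reason; the file name here is only an example.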

Comment 5 Christoph Junghans 2020-02-06 17:53:57 UTC
Running "mock -r fedora-rawhide-ppc64le --no-clean gromacs-2019.5-2.fc32.1.src.rpm"
I get:
+++ /usr/bin/ps -p 160 -ocomm=
Signal 4 (ILL) caught by ps (3.3.15).
/usr/bin/ps:ps/display.c:66: please report this bug
++ my_shell=

Comment 6 Ben Cotton 2020-02-11 17:06:25 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 7 Christoph Junghans 2020-02-12 03:02:03 UTC
Details on the aarch64 error:
22/27 Test #22: UtilityMpiUnitTests ..............***Failed    0.52 sec
Invalid error code (-2) (error ring index 127 invalid)
INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373
Invalid error code (-2) (error ring index 127 invalid)
INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(586)..............:
MPID_Init(224).....................: channel initialization failed
MPIDI_CH3_Init(105)................:
MPID_nem_init(324).................:
MPID_nem_tcp_init(175).............:
MPID_nem_tcp_get_business_card(401):
MPID_nem_tcp_init(373).............: gethostbyname failed, 9642102373514ac7b8330d80c6ee96d2 (errno 0)
Invalid error code (-2) (error ring index 127 invalid)
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(586)..............:
MPID_Init(224).....................: channel initialization failed
MPIDI_CH3_Init(105)................:
MPID_nem_init(324).................:
MPID_nem_tcp_init(175).............:
MPID_nem_tcp_get_business_card(401):
MPID_nem_tcp_init(373).............: gethostbyname failed, 9642102373514ac7b8330d80c6ee96d2 (errno 0)

So this seems to be a bug in mpich.
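
The stack above ends in "gethostbyname failed" for what looks like the build machine's own host name, so a quick check is whether that name resolves inside the build environment at all. A minimal sketch using only the Python standard library; the function name `hostname_resolves` and the injectable `resolver` parameter are illustrative assumptions, and this does not reproduce what MPICH does internally.

```python
import socket


def hostname_resolves(name=None, resolver=socket.gethostbyname):
    """Return True if `name` (default: this machine's host name) resolves."""
    if name is None:
        name = socket.gethostname()
    try:
        resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return False
    return True
```

If `hostname_resolves()` returns False inside the aarch64 builder, the failure would be consistent with the host name not being resolvable there rather than with anything in gromacs itself.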

Comment 8 Christoph Junghans 2020-02-14 22:39:25 UTC
ppc64le issue reported upstream: https://redmine.gromacs.org/issues/3380

Comment 9 Christoph Junghans 2020-02-14 22:40:13 UTC
(In reply to Christoph Junghans from comment #7)
MPICH issue patched here: https://src.fedoraproject.org/rpms/mpich/pull-request/2

Comment 10 Fedora Release Engineering 2020-02-16 04:26:47 UTC
Dear Maintainer,

your package has not been built successfully in Fedora 32. Action is required from you.

If you can fix your package to build, perform a build in koji, and either create
an update in bodhi, or close this bug without creating an update, if updating is
not appropriate [1]. If you are working on a fix, set the status to ASSIGNED to
acknowledge this. Following the latest policy for such packages [2], your package
will be orphaned if this bug remains in NEW state more than 8 weeks.

A week before the mass branching of Fedora 33 according to the schedule [3],
any packages not successfully rebuilt at least on Fedora 31 will be
retired regardless of the status of this bug.

[1] https://fedoraproject.org/wiki/Updates_Policy
[2] https://docs.fedoraproject.org/en-US/fesco/Fails_to_build_from_source_Fails_to_install/
[3] https://fedoraproject.org/wiki/Releases/33/Schedule

