Bug 1950149

Summary: [regression] elpa testsuite hangs with mpich-3.4 (works with 3.3)
Product: [Fedora] Fedora
Reporter: Dominik 'Rathann' Mierzejewski <dominik>
Component: mpich
Assignee: Zbigniew Jędrzejewski-Szmek <zbyszek>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 34
CC: dakingun, zbyszek
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2022-06-08 01:13:48 UTC
Type: Bug

Description Dominik 'Rathann' Mierzejewski 2021-04-15 21:30:40 UTC
Description of problem:
elpa testsuite hangs with mpich 3.4 on all arches. It used to work on F33+ (mpich-3.3.x).

Version-Release number of selected component (if applicable):
3.4.1-1.fc34

How reproducible:
Always.

Steps to Reproduce:
1. Try building elpa-2020.05.001-2.fc34 on F34+.

Actual results:
The build hangs with 100% CPU usage in %check section when running tests under mpich.

Expected results:
Tests pass successfully.

Comment 1 Zbigniew Jędrzejewski-Szmek 2021-05-17 09:10:01 UTC
The test passes for me here (both with the mpich from f34 and a freshly rebuilt mpich in rawhide mock). 
Maybe try again? Or provide more details.

Comment 2 Dominik 'Rathann' Mierzejewski 2021-07-07 09:11:03 UTC
I'm afraid I can still reproduce this in mock locally. What details do you need here?

Comment 3 Dominik 'Rathann' Mierzejewski 2021-07-07 09:53:29 UTC
I've just run a new scratch build of 2021.05.001[1] and it's hanging in these two tests under mpich (plain only, OpenMPI succeeds):

validate_c_version_complex_double_eigenvectors_2stage_default_kernel_random
validate_c_version_real_double_eigenvectors_2stage_default_kernel_random

So, I'm going to disable these two for now.

[1] https://koji.fedoraproject.org/koji/taskinfo?taskID=71442735

Comment 4 Zbigniew Jędrzejewski-Szmek 2022-04-11 08:35:33 UTC
Hmm, I now see that %check does not fail on error.
I see that it "fails" left and right, with various workers killed by SIGILL.
I'm not even sure now if I looked at the detailed logs when testing this before.
I tried with the two tests disabled and enabled, and I see failures in both
cases.

Anyway, I built mpich-3.4.3 for F36 and mpich-4.0.2 for rawhide. Please check if
this improves things for you.
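
Editorial sketch, not part of the original comment: the observations above (tests that hang, workers killed by SIGILL, and a %check that does not fail on error) suggest running each test under a timeout and turning any bad outcome into a failing status. The following is a minimal illustration of that idea in Python, not the elpa or mpich packaging logic. The helper names classify_run and run_suite are hypothetical, and a launcher such as mpiexec may report a worker's signal as an ordinary non-zero exit code rather than as a signal.

```python
import signal
import subprocess
import sys


def classify_run(cmd, timeout_s):
    """Run cmd and classify it as 'pass', 'fail', 'hang' or 'signal:<NAME>'.

    Hypothetical helper for illustration only.
    """
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising. Processes
        # started by that child (for example MPI ranks) may need extra cleanup.
        return "hang"
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from that signal.
        return "signal:" + signal.Signals(-proc.returncode).name
    return "pass" if proc.returncode == 0 else "fail"


def run_suite(tests, timeout_s):
    """Run every test and return (results, ok); ok is False on any non-pass."""
    results = {name: classify_run(cmd, timeout_s) for name, cmd in tests.items()}
    ok = all(outcome == "pass" for outcome in results.values())
    return results, ok
```

A caller could pass a mapping from test name to command line, for instance one of the two test names from comment 3 launched through mpiexec (the exact command line and binary path are assumptions), and exit with a non-zero status when ok is False so that the build stops instead of silently continuing.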

Comment 5 Ben Cotton 2022-05-12 16:48:02 UTC
This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 34 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 6 Ben Cotton 2022-06-08 01:13:48 UTC
Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07.

Fedora Linux 34 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release.

Thank you for reporting this bug and we are sorry it could not be fixed.