Bug 1914777
| Summary: | bout++ fails to build with Python 3.10: test-multigrid_laplace - timeout | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Tomáš Hrnčiar <thrnciar> |
| Component: | bout++ | Assignee: | david08741 |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rawhide | CC: | david08741, mhroncok, thrnciar |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-01-17 14:29:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1890881 | ||
Description
Tomáš Hrnčiar
2021-01-11 08:38:25 UTC
IIRC this should only happen in Copr and not Koji. A workaround is to enable network access. See https://bugzilla.redhat.com/show_bug.cgi?id=1793612#c1 for details.

I don't think it is that simple; the MPI issue is, I think, fixed, at least on Rawhide. The test should not be particularly slow either: normally 20 to 30 seconds, well below the 600-second timeout. I will try to investigate this, and thus keep the bug open.

I am tempted to say the issue is that Copr does not have enough cores. Even though the test only uses 3 threads, that might be sufficient to trigger the timeout. On an old 2-core system the test finishes in about 4 seconds with 1 thread, but with 3 threads it takes over 4 minutes. I am not sure what Copr is using, but I think it also uses old CPUs and very few of them (1?), in which case the test might well take more than 10 minutes. On a decent 64-core system the single-thread version takes 1.3 seconds, and 1.0 seconds with 3 threads. If this keeps being an issue, I can disable the test on Copr, or when only one core is available.

The underlying issue is that MPI is optimized to be fast on non-oversubscribed systems. While in the real world MPI should never be run oversubscribed, this is common for testing, in which case the "idle" ranks are busy-waiting on the other ranks ...

Any explanation why it works with network enabled?

Pure luck, I guess ... The timeout is 600 seconds; in the run with network enabled it took:

test-multigrid_laplace ✓ 588.655 s

In that case, increasing the timeout might be the easiest solution ...

I have increased the timeout from 10m to 15m; I think that should fix the issue.
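The busy-wait point above can be illustrated without MPI at all. The following is a rough sketch using plain Python threads (an analogy only, not BOUT++ or MPI code; the function and parameter names are made up for illustration): a spinning waiter consumes CPU for the entire wait, while a blocking waiter costs almost nothing, which is why oversubscribed MPI runs degrade so badly on machines with few cores.

```python
# Sketch: CPU cost of busy-waiting vs. blocking while "work" happens elsewhere.
import threading
import time

def measure(busy_wait: bool, n_waiters: int = 3, duration: float = 0.2) -> float:
    """Return the CPU time consumed while n_waiters wait for a signal."""
    done = threading.Event()

    def waiter():
        if busy_wait:
            while not done.is_set():   # spin, like an MPI rank polling for messages
                pass
        else:
            done.wait()                # block, yielding the CPU to the working rank

    threads = [threading.Thread(target=waiter) for _ in range(n_waiters)]
    start = time.process_time()
    for t in threads:
        t.start()
    time.sleep(duration)               # stands in for the rank doing real work
    done.set()
    for t in threads:
        t.join()
    return time.process_time() - start

if __name__ == "__main__":
    print(f"busy-wait CPU time: {measure(True):.3f} s")
    print(f"blocking  CPU time: {measure(False):.3f} s")
```

With busy-waiting, the "idle" waiters consume roughly the whole wait as CPU time, stealing cycles from the working thread on a core-starved builder; with blocking waits the CPU cost is near zero.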