Bug 1728057 - espresso fails to build in rawhide (Fedora 31)
Summary: espresso fails to build in rawhide (Fedora 31)
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: espresso
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Christoph Junghans
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/espressomd/espress...
Whiteboard:
Duplicates: 1735195
Depends On: 1746564
Blocks: F31FTBFS PYTHON38 1732841
 
Reported: 2019-07-08 23:03 UTC by Miro Hrončok
Modified: 2019-09-10 18:07 UTC (History)

Fixed In Version: espresso-4.0.2-7.fc31, espresso-4.0.2-7.fc32
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-10 18:07:53 UTC
Type: Bug



Description Miro Hrončok 2019-07-08 23:03:29 UTC
espresso fails to build with Python 3.8.0b2.

Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.Nh6GKt
+ umask 022
+ cd /builddir/build/BUILD
+ cd /builddir/build/BUILD
+ rm -rf espresso
+ /usr/bin/gzip -dc /builddir/build/SOURCES/espresso-4.0.2.tar.gz
+ /usr/bin/tar -xof -
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd espresso
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ echo 'Patch #0 (2946.patch):'
+ /usr/bin/patch --no-backup-if-mismatch -p1 --fuzz=0
error: Bad exit status from /var/tmp/rpm-tmp.Nh6GKt (%prep)
    Bad exit status from /var/tmp/rpm-tmp.Nh6GKt (%prep)
Patch #0 (2946.patch):
patching file src/core/io/writer/CMakeLists.txt
patching file .gitlab-ci.yml
Hunk #2 FAILED at 297.
Hunk #3 FAILED at 310.
2 out of 3 hunks FAILED -- saving rejects to file .gitlab-ci.yml.rej

This doesn't seem Python 3.8 related, yet it blocks the 3.8 rebuild.
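
The patch output above shows that the src/core/io/writer/CMakeLists.txt change applied and only the .gitlab-ci.yml hunks were rejected. As a minimal sketch of how to pull that out of a build log (the helper name `failed_hunks` is hypothetical and not part of any Fedora tooling):

```python
import re


def failed_hunks(patch_output):
    """Map each patched file to the hunk numbers that patch(1) rejected."""
    failures = {}
    current_file = None
    for line in patch_output.splitlines():
        match = re.match(r"patching file (\S+)", line)
        if match:
            current_file = match.group(1)
            continue
        match = re.match(r"Hunk #(\d+) FAILED at \d+\.", line)
        if match and current_file is not None:
            failures.setdefault(current_file, []).append(int(match.group(1)))
    return failures


# Lines copied from the %prep log in this report.
log = """\
patching file src/core/io/writer/CMakeLists.txt
patching file .gitlab-ci.yml
Hunk #2 FAILED at 297.
Hunk #3 FAILED at 310.
2 out of 3 hunks FAILED -- saving rejects to file .gitlab-ci.yml.rej
"""

print(failed_hunks(log))  # {'.gitlab-ci.yml': [2, 3]}
```

Files whose hunks all applied do not appear in the result, so an empty dictionary would mean the patch applied cleanly.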

For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.8/fedora-rawhide-x86_64/00964737-espresso/

For all our attempts to build espresso with Python 3.8, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.8/package/espresso/

Testing and mass rebuild of packages is happening in copr. To test locally in mock whether your package builds with Python 3.8, follow these instructions:
https://copr.fedorainfracloud.org/coprs/g/python/python3.8/

Let us know here if you have any questions.

Comment 1 Christoph Junghans 2019-07-09 19:41:07 UTC
commit 4a3b1a4f9fe2d0c49d64899275db58e7f434bddb (HEAD -> master, origin/master, origin/HEAD)
Author: Christoph Junghans <junghans@votca.org>
Date:   Tue Jul 9 13:40:24 2019 -0600

    use 2946.patch explicitly for bug #1728057

diff --git a/espresso.spec b/espresso.spec
index 4402044..0e3fd13 100644
--- a/espresso.spec
+++ b/espresso.spec
@@ -14,7 +14,8 @@ Source0:        https://github.com/%{name}md/%{name}/archive/%{commit}/%{name}-%
 %else
 Source0:       https://github.com/%{name}md/%{name}/releases/download/%{version}/%{name}-%{version}.tar.gz
 # Add missing so number to libH5mdCore
-Patch0:        https://github.com/espressomd/espresso/pull/2946.patch
+# https://github.com/espressomd/espresso/pull/2946
+Patch0:        2946.patch
 %endif

Comment 2 Miro Hrončok 2019-07-10 12:06:13 UTC
The package still fails to build:

https://koji.fedoraproject.org/koji/taskinfo?taskID=36164517

Comment 3 Christoph Junghans 2019-07-10 13:28:31 UTC
Now an actual test fails:

23/41 Test #23: ParticleCache_test ...............***Failed    2.65 sec
Running 7 test cases...
Running 7 test cases...
unknown location(0): fatal error: in "update": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag
/builddir/build/BUILD/espresso/src/core/unit_tests/ParticleCache_test.cpp(120): last checkpoint
unknown location(0): fatal error: in "update_with_bonds": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_RANK: invalid rank
/builddir/build/BUILD/espresso/src/core/unit_tests/ParticleCache_test.cpp(129): last checkpoint: "update_with_bonds" test entry
*** 3 failures are detected in the test module "ParticleCache test"
unknown location(0): fatal error: in "iterators": Throw location unknown (consider using BOOST_THROW_EXCEPTION)
Dynamic exception type: boost::wrapexcept<boost::mpi::exception>
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag
/builddir/build/BUILD/espresso/src/core/unit_tests/ParticleCache_test.cpp(195): last checkpoint
*** No errors detected
[1562754342.323155] [buildvm-31:11595:0]          mpool.c:37   UCX  WARN  object 0x7f0d7ea5afc0 was not returned to mpool ucp_am_bufs
[1562754342.323260] [buildvm-31:11595:0]          mpool.c:37   UCX  WARN  object 0x7f0d7df928e0 was not returned to mpool mm_recv_desc
[1562754342.323273] [buildvm-31:11595:0]          mpool.c:37   UCX  WARN  object 0x7f0d7df94960 was not returned to mpool mm_recv_desc
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
  Process name: [[57821,1],0]
  Exit code:    201
--------------------------------------------------------------------------


Reported upstream: https://github.com/espressomd/espresso/issues/2985
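
For a quick overview of which test case hit which MPI error, the Boost.Test output above can be summarized with a small script. This is only a sketch: the function name `mpi_errors` is hypothetical, and it assumes each "fatal error: in ..." line is followed by its "std::exception::what" line, as in the log quoted in this comment.

```python
import re


def mpi_errors(log):
    """Return (test case, MPI call, MPI error class) for each reported failure."""
    results = []
    test_case = None
    for line in log.splitlines():
        match = re.search(r'fatal error: in "([^"]+)"', line)
        if match:
            test_case = match.group(1)
            continue
        match = re.search(r"std::exception::what: (\w+): (MPI_ERR_\w+)", line)
        if match and test_case is not None:
            results.append((test_case, match.group(1), match.group(2)))
            test_case = None
    return results


# Abbreviated lines taken from the ParticleCache_test output in this comment.
sample = """\
unknown location(0): fatal error: in "update": Throw location unknown
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag
unknown location(0): fatal error: in "update_with_bonds": Throw location unknown
std::exception::what: MPI_Recv: MPI_ERR_RANK: invalid rank
unknown location(0): fatal error: in "iterators": Throw location unknown
std::exception::what: MPI_Recv: MPI_ERR_TAG: invalid tag
"""

for test_case, call, error in mpi_errors(sample):
    print(test_case, call, error)
```

All three failures come from MPI_Recv, two with an invalid tag and one with an invalid rank.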

Comment 4 Miro Hrončok 2019-07-31 19:57:14 UTC
*** Bug 1735195 has been marked as a duplicate of this bug. ***

Comment 5 Ben Cotton 2019-08-13 18:44:11 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to 31.

Comment 6 Miro Hrončok 2019-08-14 22:24:33 UTC
The coordinated rebuild of Python 3.8 has started in the `f32-python` side tag.

If you figure out how to rebuild this package, please don't rebuild it in regular rawhide, but use the side tag instead:

    on branch master:
    $ fedpkg build --target=f32-python

To wait for a build to show up in the side tag, do:

    $ koji wait-repo f32-python --build=<nvr>

Where <nvr> is name-version-release of the source package, e.g. python-foo-1.1-2.fc32.
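
Because package names may themselves contain hyphens, an NVR is split from the right: the last two hyphen-separated fields are the version and the release. A minimal sketch of that convention (the helper `split_nvr` is hypothetical, not a koji or fedpkg API):

```python
def split_nvr(nvr):
    """Split a name-version-release string into its three parts."""
    parts = nvr.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError(f"not a name-version-release string: {nvr!r}")
    name, version, release = parts
    return name, version, release


print(split_nvr("python-foo-1.1-2.fc32"))  # ('python-foo', '1.1', '2.fc32')
print(split_nvr("espresso-4.0.2-7.fc31"))  # ('espresso', '4.0.2', '7.fc31')
```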

An updated mock config is posted at:
http://copr.fedorainfracloud.org/coprs/g/python/python3.8/

Note that it will take a while before the essential packages are rebuilt, so don't expect all your dependencies to be available right away.

Thanks. Let us know if you need up to date info, or if you have any questions.



PS this message is mass posted to all the bugs that block the PYTHON38 bug. If this is also a Fedora 31 FTBFS bug and you manage to fix it, you can do a f31 build as usual:

    on branch f31:
    $ fedpkg build

Comment 7 Miro Hrončok 2019-08-21 16:35:18 UTC
The f32-python side tag has been merged. In order to rebuild the package, do it in regular rawhide, but please wait until python3-3.8 is tagged:

  $ koji wait-repo f32-build --build python3-3.8.0~b3-3.fc32


If your build already started in f32-python, after it is finished, please tag it to rawhide with:

  $ koji tag-build f32-pending <nvr>

For example:

  $ koji tag-build f32-pending libreoffice-6.3.0.4-3.fc32

Thanks!

(This comment is mass posted to all bugzillas blocking the PYTHON38 tracking bug.)

Comment 8 Christoph Junghans 2019-08-21 16:48:29 UTC
Upstream did a detailed analysis (see https://github.com/espressomd/espresso/issues/2985#issuecomment-523062014): this is actually a problem in openmpi-4 and boost.mpi-1.69.

Comment 9 Miro Hrončok 2019-08-21 17:29:21 UTC
(Python 3.8 has landed in the rawhide buildroot.)

Comment 10 Jean-Noël Grad 2019-08-29 16:16:19 UTC
Based on the investigation in Bug 1746564, and using ucx devel + openmpi 4.0.1 + boost 1.69 all compiled from sources with the patch from openmpi 4.0.2, I was able to compile espresso 4.0.2 in Fedora 31 and pass the C++ unit tests, the Python tests, and the subset of sample tests that do not depend on graphical Python modules.

Comment 11 Philip Kovacs 2019-09-01 17:56:06 UTC
The f31 test build passed all arches except ppc64le where it failed on the openmpi side.  113/115 tests passed, but:

The following tests FAILED:
      7 - coulomb_tuning (Failed)
      113 - lb_shear (Failed)


Looking at the root logs of the successful f32 vs failed f31 builds, any dependency difference could be the problem,
but this one catches my eye:

ucx   ppc64le   1.5.2-2.fc31
ucx   ppc64le   1.6.0-1.fc32
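
A comparison like this can be scripted once the package lists have been extracted from the two root logs. The sketch below assumes the three-column "name arch version-release" layout used above (the real root.log format differs and would need its own parsing), compares only the upstream version so that the .fc31/.fc32 dist tag alone does not count as a difference, and uses a made-up package `examplepkg` purely for illustration.

```python
def parse_packages(text):
    """Parse 'name arch version-release' lines into {(name, arch): version}."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a three-column package line
        name, arch, version_release = fields
        # Drop the release (and its dist tag); keep the upstream version.
        packages[(name, arch)] = version_release.rsplit("-", 1)[0]
    return packages


def changed_versions(old_text, new_text):
    """Return packages present in both lists whose upstream version differs."""
    old = parse_packages(old_text)
    new = parse_packages(new_text)
    return {
        key: (old[key], new[key])
        for key in sorted(old.keys() & new.keys())
        if old[key] != new[key]
    }


# ucx lines are from this comment; examplepkg is hypothetical.
f31 = "ucx   ppc64le   1.5.2-2.fc31\nexamplepkg   ppc64le   1.0-1.fc31\n"
f32 = "ucx   ppc64le   1.6.0-1.fc32\nexamplepkg   ppc64le   1.0-1.fc32\n"

print(changed_versions(f31, f32))  # {('ucx', 'ppc64le'): ('1.5.2', '1.6.0')}
```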

Comment 12 Christoph Junghans 2019-09-04 16:33:08 UTC
I wasn't able to reproduce this on ppc64le on f31: https://koji.fedoraproject.org/koji/taskinfo?taskID=37456788. It seems to work, or am I missing something?

Comment 13 Philip Kovacs 2019-09-05 06:11:43 UTC
Just wanted to add that I asked the Mellanox guys to update ucx to 1.6 in F31 which they did:

https://koji.fedoraproject.org/koji/buildinfo?buildID=1370954

All the ucx libraries remain at version .so.0, so no rebuilds of ucx dependencies are needed
-- you could just create an F31 build root override for that ucx build and openmpi will use it.
That said, I am not sure ucx was the problem.  Just adding that it is available at version 1.6 
in F31 if you need it.

