Bug 1794127 - gcc 10: template instantiation failure on s390x
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: z3
Version: 32
Hardware: s390x
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jerry James
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-01-22 18:00 UTC by Jerry James
Modified: 2021-05-25 17:35 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 17:35:07 UTC
Type: Bug
Embargoed:


Attachments
Compressed tar file containing F31 & Rawhide ii files (310.57 KB, application/x-xz)
2020-01-25 22:42 UTC, Jerry James

Description Jerry James 2020-01-22 18:00:07 UTC
Description of problem:
I am attempting to build the z3 package due to the recent ocaml update.  On s390x only, the build fails due to failure to instantiate a template:

/usr/bin/ld: util/lp/lp.a(lar_solver.o): in function `lp::lar_solver::update_x_and_inf_costs_for_columns_with_changed_bounds_tableau()':
/builddir/build/BUILD/z3-z3-4.8.7/build/../src/util/lp/lar_solver.cpp:838: undefined reference to `lp::lp_primal_core_solver<rational, lp::numeric_pair<rational> >::update_inf_cost_for_column_tableau(unsigned int)'
/usr/bin/ld: util/lp/lp.a(lar_solver.o): in function `lp::lar_solver::clean_inf_set_of_r_solver_after_pop()':
/builddir/build/BUILD/z3-z3-4.8.7/build/../src/util/lp/lar_solver.cpp:1547: undefined reference to `lp::lp_primal_core_solver<rational, lp::numeric_pair<rational> >::update_inf_cost_for_column_tableau(unsigned int)'
/usr/bin/ld: /builddir/build/BUILD/z3-z3-4.8.7/build/../src/util/lp/lar_solver.cpp:1549: undefined reference to `lp::lp_primal_core_solver<rational, lp::numeric_pair<rational> >::update_inf_cost_for_column_tableau(unsigned int)'
collect2: error: ld returned 1 exit status

This did not happen with the previous version of gcc, and the sources are unchanged.  It is suspicious that this is happening only on s390x:

https://koji.fedoraproject.org/koji/buildinfo?buildID=1430613

Version-Release number of selected component (if applicable):
gcc-10.0.1-0.4.fc32.s390x

How reproducible:
Twice so far.

Steps to Reproduce:
1. fedpkg clone z3
2. cd z3
3. fedpkg build

Actual results:
The build fails on s390x due to template instantiation failure, and succeeds on all other architectures.

Expected results:
Successful build on all architectures.

Additional info:
This package is blocking more builds I need to do, so I'm going to attempt to force the template instantiation.  If you see evidence of me trying to do that in the git history, roll back to the previous commit to see the problem.

Comment 1 Marek Polacek 2020-01-22 18:20:08 UTC
First we'll need the .ii file to be able to bisect what has changed; can you help us get that?

Probably some optimization kicked in, requiring an explicit instantiation where it previously wasn't needed.

Comment 2 Jeff Law 2020-01-22 19:03:15 UTC
I would hazard a guess it was all of Jan's retuning of the inliner.  I've seen, reported and fixed a ton of these kinds of issues already.  Most were old style C code, but there certainly were cases where changes in the inlining decisions made ultimately required some template instantiations where they weren't needed before (the packages with embedded copies of yahttp for example need explicit instantiations of template class AsyncLoader<Request>).



According to my builder z3 builds on x86_64 without LTO, so something particular about the Z series port must be coming into play.

Comment 3 Jakub Jelinek 2020-01-22 19:04:04 UTC
We might actually need two .ii files (plus the corresponding g++ options): one for the translation unit that previously provided the now-missing symbols (or that still provides them on some other arch), and one for the translation unit that needs them.

Comment 4 Jerry James 2020-01-22 21:35:33 UTC
(In reply to Jeff Law from comment #2)
> I would hazard a guess it was all of Jan's retuning of the inliner.  I've
> seen, reported and fixed a ton of these kinds of issues already.  Most were
> old style C code, but there certainly were cases where changes in the
> inlining decisions made ultimately required some template instantiations
> where they weren't needed before (the packages with embedded copies of
> yahttp for example need explicit instantiations of template class
> AsyncLoader<Request>).

Okay.  I've prepared a patch to add an explicit instantiation, and I'm going to give it a try momentarily.  Z3 takes a while to build, so I won't know if I succeeded for a little while.

> According to my builder z3 builds on x86_64 without LTO, so something
> particular about the Z series port must be coming into play.

Right, it builds successfully on all other Fedora arches, just not s390x.

Tonight I will see if I still have a remote account on an s390x machine.  If so, I should be able to get the .ii files for you.  If not, I'll have to figure out how to hack up %build so that it cats the .ii files into the build log.

Comment 5 Jerry James 2020-01-25 22:42:18 UTC
Created attachment 1655332
Compressed tar file containing F31 & Rawhide ii files

I have created ii files for both src/util/lp/lar_solver.cpp, which is where the reference occurs, and src/util/lp/lp_primal_core_solver.cpp, which is the only place where the definition of update_inf_cost_for_column_tableau is visible.  I did this for both F31, where the template instantiation succeeds, and rawhide, where it fails.  In both cases and for both files, g++ is invoked with these flags:

g++ -D_MP_GMP -DZ3GITHASH=30e7c225cd51 -DNDEBUG -D_EXTERNAL_RELEASE -D_USE_THREAD_LOCAL -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=zEC12 -mtune=z13 -fasynchronous-unwind-tables -fstack-clash-protection -std=c++11 -c -D_LINUX_ -fPIC -I../src/util -I../src/nlsat -I../src/math/polynomial -I../src/sat -I../src

Comment 6 Jerry James 2020-01-25 22:44:12 UTC
I should note that I did get the z3 build to succeed by adding an explicit template instantiation to lp_primal_core_solver.cpp.

Comment 7 Jonathan Wakely 2020-01-27 12:50:08 UTC
The explicit instantiation is required to make the code correct. The standard says in [temp.pre] p10:

A function template, member function of a class template, variable template, or static data member of a class template shall be defined in every translation unit in which it is implicitly instantiated unless the corresponding specialization is explicitly instantiated in some translation unit; no diagnostic is required.

Comment 8 Jakub Jelinek 2020-01-27 12:55:04 UTC
Reassigning to z3, as the bug is in there.

Comment 9 Jonathan Wakely 2020-01-27 12:58:53 UTC
For C++11 that text is in [temp] p6 but the rule is the same. In effect, it means an implicit instantiation in lp_primal_core_solver.cpp is not guaranteed to emit a symbol definition that the linker can use for other implicit instantiations in other translation units.
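
To illustrate the rule, here is a minimal sketch of the pattern, collapsed into one listing. The names (Solver, update_cost, use_solver) and the three-file split marked in the comments are hypothetical and are not taken from the z3 sources or from the upstream patch. The member function is defined in only one translation unit, so that translation unit carries an explicit instantiation definition; without it, whether a linkable symbol gets emitted depends on what the optimizer decides to inline.

```cpp
// ---- solver.h (hypothetical header): declaration only ----
template <typename T>
struct Solver {
    T cost;
    // Declared here; the definition is visible only in solver.cpp.
    T update_cost(unsigned column);
};

// ---- solver.cpp (hypothetical): the only place the definition is visible ----
template <typename T>
T Solver<T>::update_cost(unsigned column) {
    cost += static_cast<T>(column);
    return cost;
}

// Explicit instantiation definition. This obliges the compiler to emit
// Solver<double>::update_cost in this translation unit, even if every
// call made from this file happens to be inlined.
template double Solver<double>::update_cost(unsigned column);

// ---- user.cpp (hypothetical): sees only solver.h ----
// The call below implicitly instantiates the declaration and relies on
// the symbol emitted by solver.cpp at link time.
double use_solver() {
    Solver<double> s{1.5};
    return s.update_cost(2);
}
```

If the listing is split into the three files as marked, removing the explicit instantiation line makes the program ill-formed with no diagnostic required: it may still link on one compiler or architecture and fail with an undefined reference on another, which matches the behaviour reported here.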

Comment 10 Jerry James 2020-01-29 03:39:16 UTC
Thank you Jonathan.  I'll submit a pull request upstream and include this information.  Thanks everybody for the help.

Comment 11 Jerry James 2020-01-29 03:47:32 UTC
https://github.com/Z3Prover/z3/pull/2899

Comment 12 Ben Cotton 2020-02-11 17:23:51 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 13 Fedora Program Management 2021-04-29 17:00:31 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 14 Ben Cotton 2021-05-25 17:35:07 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

