Bug 1956587

Summary: Fedora 33 GCC 10.3.1 internal compiler error compiling pytorch
Product: [Fedora] Fedora
Reporter: Mike Neilly <mneilly>
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 33
CC: aoliva, dmalcolm, fweimer, jakub, jwakely, law, mpolacek, msebor, nickc, sipoyare
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2021-11-30 18:44:38 UTC
Type: Bug
Attachments:
ii files for failing GCC 10/NVCC compile (flags: none)
Preprocessed source generated by crash (flags: none)

Description Mike Neilly 2021-05-04 02:10:15 UTC
Description of problem:

I'm trying to build pytorch on Fedora 33 with cuda-toolkit-11-2 and receiving an "internal compiler error" as follows:

#8 327.9 [ 60%] Built target nccl_slim_external
#8 340.4 /usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
#8 340.4 /usr/include/c++/10/chrono:473:154: required from here
#8 340.4 /usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
#8 340.4   428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
#8 340.4       |                           ^~~~~~
#8 340.4 Please submit a full bug report,
#8 340.4 with preprocessed source if appropriate.
#8 340.4 See http://bugzilla.redhat.com/bugzilla for instructions.
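
When triaging a long build log like the one above, the location of the internal compiler error can be extracted mechanically. The following is a minimal Python sketch, assuming log lines in the format shown above (optionally prefixed by Docker build markers such as `#8 340.4`); the names `IceLocation` and `find_ice` are illustrative and not part of any tool mentioned in this report.

```python
import re
from typing import NamedTuple, Optional

# Matches "<path>:<line>:<col>: internal compiler error: <message>".
# Any prefix before the path (e.g. "#8 340.4 ") is skipped by search().
ICE_RE = re.compile(
    r"(?P<path>[^\s:]+):(?P<line>\d+):(?P<col>\d+): "
    r"internal compiler error: (?P<message>.*)"
)


class IceLocation(NamedTuple):
    path: str
    line: int
    column: int
    message: str


def find_ice(log_text: str) -> Optional[IceLocation]:
    """Return the first internal-compiler-error location in a log, if any."""
    for raw in log_text.splitlines():
        match = ICE_RE.search(raw)
        if match:
            return IceLocation(
                match["path"],
                int(match["line"]),
                int(match["col"]),
                match["message"].strip(),
            )
    return None


if __name__ == "__main__":
    sample = (
        "#8 340.4 /usr/include/c++/10/chrono:473:154: required from here\n"
        "#8 340.4 /usr/include/c++/10/chrono:428:27: "
        "internal compiler error: Segmentation fault\n"
    )
    print(find_ice(sample))
```

The sketch only looks at the `internal compiler error` line; the preceding "In substitution of" and "required from here" context lines are ignored.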


Version-Release number of selected component (if applicable):

Fedora 33, GCC 10.3.1

How reproducible:

Create Dockerfile as follows:

FROM fedora:33

RUN dnf install -y \
    dnf-plugins-core \
    git \
    g++ \
    cmake3 \
    python3-wheel \
    python3-pyyaml \
    python3-devel \
    python3-typing-extensions \
    wget

RUN dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora33/x86_64/cuda-fedora33.repo

RUN dnf install -y cuda-toolkit-11-2


Then build the image and run the PyTorch build inside the container:

docker build -t fedora33pt .
docker run -it fedora33pt /bin/bash
cd /tmp
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git checkout 82d245faef88c7aa5d5c47771801789c85a2165b # the commit I used; pinning it may or may not be necessary
git submodule update --init --recursive
python3 setup.py bdist_wheel



Actual results:

internal compiler error: segmentation fault


Expected results:

no compiler error



Additional info:

Comment 1 Marek Polacek 2021-05-05 18:06:14 UTC
Is there a chance you can provide the preprocessed source file?  Just run the compiler incantation that fails with -save-temps and post the .ii file here, thanks.
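
To illustrate the request above: the idea is to rerun the exact failing command with `-save-temps` appended, so that the compiler keeps its intermediate files, including the preprocessed `.ii` source. A minimal Python sketch, assuming the failing command is available as a single string; the helper name `with_save_temps` and the shortened example command are hypothetical and stand in for the real command line.

```python
import shlex


def with_save_temps(command: str) -> list[str]:
    """Split a shell command into argv and append -save-temps once."""
    argv = shlex.split(command)
    if "-save-temps" not in argv:
        argv.append("-save-temps")
    return argv


if __name__ == "__main__":
    # Hypothetical, shortened stand-in for the real failing command.
    failing = (
        "/usr/local/cuda/bin/nvcc cuda_private.cu -c "
        "-o cuda_private.cu.o -ccbin /usr/bin/cc"
    )
    # Print the command to rerun; it could also be passed to subprocess.run().
    print(shlex.join(with_save_temps(failing)))
```

Working on the argument list rather than the raw string avoids quoting mistakes when the original command contains spaces or shell metacharacters.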

Comment 2 Mike Neilly 2021-05-06 00:22:33 UTC
Created attachment 1780040 [details]
ii files for failing GCC 10/NVCC compile

Comment 3 Mike Neilly 2021-05-06 00:24:42 UTC
Created attachment 1780041 [details]
Preprocessed source generated by crash

Comment 4 Mike Neilly 2021-05-06 00:25:37 UTC
The command used to generate the .ii files:

[root@30a48f0aa8f1 gloo_cuda.dir]# pwd
/tmp/pytorch/build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir
[root@30a48f0aa8f1 gloo_cuda.dir]# /usr/local/cuda/bin/nvcc /tmp/pytorch/third_party/gloo/gloo/cuda_private.cu -c -o /tmp/pytorch/build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//./gloo_cuda_generated_cuda_private.cu.o -ccbin /usr/bin/cc -m64 -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=integer_sign_change,--diag_suppress=useless_using_declaration,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=implicit_return_from_non_void_function,--diag_suppress=unsigned_compare_with_zero,--diag_suppress=declared_but_not_referenced,--diag_suppress=bad_friend_decl -std=c++14 -Xcompiler -fPIC --expt-relaxed-constexpr --expt-extended-lambda -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=integer_sign_change -Xcudafe --diag_suppress=useless_using_declaration -Xcudafe --diag_suppress=set_but_not_used -DNVCC -I/usr/local/cuda/include -I/tmp/pytorch/cmake/../third_party/googletest/googlemock/include -I/tmp/pytorch/cmake/../third_party/googletest/googletest/include -I/tmp/pytorch/third_party/protobuf/src -I/tmp/pytorch/third_party/gemmlowp -I/tmp/pytorch/third_party/neon2sse -I/tmp/pytorch/third_party/XNNPACK/include -I/tmp/pytorch/cmake/../third_party/benchmark/include -I/tmp/pytorch/third_party -I/tmp/pytorch/cmake/../third_party/eigen -I/usr/include/python3.9 -I/tmp/pytorch/cmake/../third_party/pybind11/include -I/tmp/pytorch/third_party/gloo -I/tmp/pytorch/build/third_party/gloo -save-temps
/usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/include/c++/10/chrono:473:154:   required from here
/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |                           ^~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/cceKnkJW.out file, please attach this to your bugreport.

Comment 5 Mike Neilly 2021-05-06 00:26:42 UTC
I now see that an upstream GCC bug, which I had not seen before filing this one, reports the same failure on Ubuntu: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100240

Comment 6 Marek Polacek 2021-05-06 00:39:57 UTC
Great, thanks.  It's probably the same problem as in PR100240 though this ICE started with r231913.
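
For context on the construct named in the diagnostic: the expression inside `__is_harmonic` divides `_Period2` by `_Period` as a `std::ratio` and checks that the reduced denominator is 1, that is, that `_Period2` is an exact integer multiple of `_Period`. The Python sketch below models only that arithmetic, transcribed from the expression printed in the error message. It does not reproduce the ICE, and the function name `is_harmonic` and the `(num, den)` tuple representation are illustrative assumptions.

```python
from math import gcd


def is_harmonic(period2: tuple[int, int], period: tuple[int, int]) -> bool:
    """Model of the check printed in the diagnostic.

    Each period is a (num, den) pair of positive integers, mirroring
    std::ratio<num, den>. Returns True when period2 / period is an integer.
    """
    num2, den2 = period2
    num, den = period
    g_num = gcd(num2, num)  # _S_gcd(_Period2::num, _Period::num)
    g_den = gcd(den2, den)  # _S_gcd(_Period2::den, _Period::den)
    quot_num = (num2 // g_num) * (den // g_den)
    quot_den = (den2 // g_den) * (num // g_num)
    # std::ratio stores its value in lowest terms, so reduce before comparing.
    reduced_den = quot_den // gcd(quot_num, quot_den)
    return reduced_den == 1


if __name__ == "__main__":
    seconds = (1, 1)
    milliseconds = (1, 1000)
    print(is_harmonic(seconds, milliseconds))   # True: 1 s is 1000 ms
    print(is_harmonic(milliseconds, seconds))   # False: 1 ms is 1/1000 s
```

Here `period2` plays the role of `_Period2` and `period` the role of `_Period` in the alias template.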

Comment 7 Ben Cotton 2021-11-04 14:04:00 UTC
This message is a reminder that Fedora 33 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 33 reaches end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Ben Cotton 2021-11-04 14:33:05 UTC

Comment 9 Ben Cotton 2021-11-04 15:30:50 UTC

Comment 10 Ben Cotton 2021-11-30 18:44:38 UTC
Fedora 33 changed to end-of-life (EOL) status on 2021-11-30. Fedora 33 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.