Bug 1918924 - Clang is broken on armv7hl
Summary: Clang is broken on armv7hl
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: clang
Version: 34
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Tom Stellard
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1943738
Depends On:
Blocks:
Reported: 2021-01-21 17:28 UTC by Vitaly
Modified: 2021-10-02 01:27 UTC (History)

Fixed In Version: binutils-2.35.2-6.fc34
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-17 08:00:27 UTC
Type: Bug
Embargoed:


Attachments
Failing object file (2.29 KB, application/octet-stream)
2021-04-26 21:43 UTC, Tom Stellard
Proposed patch (41.38 KB, patch)
2021-05-21 14:37 UTC, Nick Clifton

Description Vitaly 2021-01-21 17:28:43 UTC
Description of problem:
Clang is broken on armv7hl architecture.

Version-Release number of selected component (if applicable):
clang 11.0.1-4.fc34
gcc 11.0.0-0.16.fc34

How reproducible:
Always.

Steps to Reproduce:
1. Try to build nheko package in Rawhide for armv7hl.
2.
3.

Actual results:
CMake Warning (dev) at /usr/share/cmake/Modules/GNUInstallDirs.cmake:223 (message):
  Unable to determine default CMAKE_INSTALL_LIBDIR directory because no
  target architecture is known.  Please enable at least one language before
  including GNUInstallDirs.
Call Stack (most recent call first):
  CMakeLists.txt:75 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.
-- The CXX compiler identification is Clang 11.0.1
-- The C compiler identification is Clang 11.0.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - failed
-- Check for working CXX compiler: /usr/bin/clang++
-- Check for working CXX compiler: /usr/bin/clang++ - broken
CMake Error at /usr/share/cmake/Modules/CMakeTestCXXCompiler.cmake:59 (message):
  The C++ compiler
    "/usr/bin/clang++"
  is not able to compile a simple test program.
  It fails with the following output:
    Change Dir: /builddir/build/BUILD/nheko-0.8.0/armv7hl-redhat-linux-gnueabi/CMakeFiles/CMakeTmp
    
    Run Build Command(s):/usr/bin/ninja-build cmTC_87320 && [1/2] Building CXX object CMakeFiles/cmTC_87320.dir/testCXXCompiler.cxx.o
    [2/2] Linking CXX executable cmTC_87320
    FAILED: cmTC_87320 
    : && /usr/bin/clang++ -O2 -flto -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong   -march=armv7-a -mfpu=vfpv3-d16 -mtune=generic-armv7-a -mabi=aapcs-linux -mfloat-abi=hard -Wl,-z,relro -Wl,--as-needed  -Wl,-z,now  -flto CMakeFiles/cmTC_87320.dir/testCXXCompiler.cxx.o -o cmTC_87320   && :
    clang-11: error: unable to execute command: Segmentation fault (core dumped)
    clang-11: error: linker command failed due to signal (use -v to see invocation)
    ninja: build stopped: subcommand failed.
    
    
  
  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:80 (project)
-- Configuring incomplete, errors occurred!
See also "/builddir/build/BUILD/nheko-0.8.0/armv7hl-redhat-linux-gnueabi/CMakeFiles/CMakeOutput.log".
See also "/builddir/build/BUILD/nheko-0.8.0/armv7hl-redhat-linux-gnueabi/CMakeFiles/CMakeError.log".
RPM build errors:
error: Bad exit status from /var/tmp/rpm-tmp.zRVNb8 (%build)
    Bad exit status from /var/tmp/rpm-tmp.zRVNb8 (%build)
Child return code was: 1

Expected results:
Successful build.

Additional info:
Koji task: https://koji.fedoraproject.org/koji/taskinfo?taskID=60156825

Comment 1 Tom Stellard 2021-01-21 18:52:25 UTC
I can reproduce this failure locally using mock's forcearch feature:

fedpkg clone nheko
cd nheko
fedpkg mockbuild --root fedora-rawhide-armhfp

Comment 2 Tom Stellard 2021-01-22 03:30:31 UTC
The segmentation fault is actually coming from the linker (ld.bfd) and not clang.  I haven't been able to figure out how to get a stacktrace inside QEMU, so I can't tell if it's a bug in ld.bfd or a bug in the LLVM gold plugin.  There are 2 ways you can work around this issue:

1. If you want to continue using LTO, you can use lld as the linker.  This means adding BuildRequires: lld and updating the linker flags, like this:
%global build_ldflags %(echo %{build_ldflags} -fuse-ld=lld)

2. If you want to keep using ld.bfd, you can just disable LTO:
%global _lto_cflags %{nil}

Comment 3 Vitaly 2021-01-22 14:55:18 UTC
Added a temporary workaround to the nheko package:

%ifarch %{arm}
%global _lto_cflags %{nil}
%endif

Comment 4 Nick Clifton 2021-01-22 15:17:47 UTC
(In reply to Tom Stellard from comment #2)
> The segmentation fault is actually coming from the linker (ld.bfd) and not
> clang.  I haven't been able to figure out how to get a stacktrace inside
> QEMU, so I can't tell if it's a bug in ld.bfd or a bug in the LLVM gold
> plugin.

Given that compiling without LTO enabled makes the bug go away, I would guess
that the problem is the gold plugin.  This is actually part of gcc, rather
than the binutils, so maybe the recent update of gcc to 11.0.0-0.6 might be
the cause.  But it could also be the gold linker's use of the plugin that
is the cause.  There is a scratch build available of a new binutils built 
from the forthcoming 2.36 release which you might like to try.  (Note - this 
scratch build is not going to be turned into an official update of the
rawhide binutils until after the F34 branch is created).

 https://koji.fedoraproject.org/koji/taskinfo?taskID=60208987

Comment 5 Ben Cotton 2021-02-09 15:42:07 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 34 development cycle.
Changing version to 34.

Comment 6 Jun Aruga 2021-03-28 20:56:48 UTC
> The segmentation fault is actually coming from the linker (ld.bfd) and not clang.  I haven't been able to figure out how to get a stacktrace inside QEMU, so I can't tell if it's a bug in ld.bfd or a bug in the LLVM gold plugin.  There are 2 ways you can work around this issue:

@Tom Stellard You might be able to ask someone with access to a native armv7 (32-bit ARM) machine in Fedora.
See https://fedoraproject.org/wiki/Test_Machine_Resources_For_Package_Maintainers

Comment 7 Tom Stellard 2021-04-10 00:54:32 UTC
*** Bug 1943738 has been marked as a duplicate of this bug. ***

Comment 8 Tom Stellard 2021-04-24 03:09:45 UTC
I was able to get a backtrace for the crash:

#0  elf32_arm_output_arch_local_syms (output_bfd=output_bfd@entry=0xff2a7188, info=info@entry=0xff2a7188, flaginfo=flaginfo@entry=0xffee6014, func=<optimized out>)
    at elf32-arm.c:18291
18291            if (local_iplt[i] != NULL
(gdb) bt
#0  elf32_arm_output_arch_local_syms (output_bfd=output_bfd@entry=0xff2a7188, info=info@entry=0xff2a7188, flaginfo=flaginfo@entry=0xffee6014, func=<optimized out>)
    at elf32-arm.c:18291
#1  0xff5ea8b8 in bfd_elf_final_link (abfd=<optimized out>, info=<optimized out>) at elflink.c:12828
#2  0xff5b8878 in elf32_arm_final_link (abfd=0xff2a7188, info=0xfffeebb8 <link_info>) at elf32-arm.c:13745
#3  0xffeef090 in ldwrite () at ldwrite.c:545
#4  main (argc=<optimized out>, argv=<optimized out>) at ./ldmain.c:512

Comment 9 Nick Clifton 2021-04-26 11:43:35 UTC
(In reply to Tom Stellard from comment #8)
Hi Tom,

  Which version of the binutils was installed ?

  
> #0  elf32_arm_output_arch_local_syms
> (output_bfd=output_bfd@entry=0xff2a7188, info=info@entry=0xff2a7188,
> flaginfo=flaginfo@entry=0xffee6014, func=<optimized out>)
>     at elf32-arm.c:18291
> 18291            if (local_iplt[i] != NULL

  Are you able to collect the object files and libraries involved in this link ?
  And the linker command line too, if possible.  Then I can attempt to reproduce the failure.

Cheers
  Nick

Comment 10 Tom Stellard 2021-04-26 21:43:56 UTC
Created attachment 1775725 [details]
Failing object file

The object file is actually LLVM bitcode (used for LTO), so I don't know if this is helpful, but you can reproduce with this command:

/usr/bin/ld  -o main /usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/../../../../lib/crt1.o /usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/../../../../lib/crti.o /usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/crtbegin.o -L/usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11 -L/usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/../../../../lib -L/usr/bin/../lib -L/lib/../lib -L/usr/lib/../lib -L/usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/../../.. -L/usr/bin/../lib -L/lib -L/usr/lib -plugin /usr/bin/../lib/LLVMgold.so  clang-lto-bfd-arm-crash.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/crtend.o /usr/lib/gcc/armv7hl-redhat-linux-gnueabi/11/../../../../lib/crtn.o

Comment 11 Tom Stellard 2021-04-26 23:34:31 UTC
By using slightly different input files, I was able to get bfd to abort(). Maybe this is more helpful:

<mock-chroot> sh-5.1# ld.bfd --version
GNU ld version 2.36.1-8.fc35
Copyright (C) 2021 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.

<mock-chroot> sh-5.1# cat main.c
void hello();
  
int main(int argc, char **argv) {
  hello();
  return 0;
}
<mock-chroot> sh-5.1# cat hello.c
#include <stdio.h>

void hello() {
  printf("Hello World\n");
}
<mock-chroot> sh-5.1# clang -flto -o main main.c hello.c
/usr/bin/ld: BFD version 2.36.1-8.fc35 internal error, aborting at elfcode.h:224 in bfd_elf32_swap_symbol_out

/usr/bin/ld: Please report this bug.

clang-12: error: linker command failed with exit code 1 (use -v to see invocation)
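
For anyone retesting, the two-file reproducer above can be scripted so that the failing link and both workarounds from comment 2 are tried side by side. This is only a minimal sketch and was not run as part of this report: the try_link helper, the variant labels, and the temporary directory are illustrative names, and the lto-lld variant assumes lld is installed.

```shell
#!/bin/sh
# Sketch: compare three ways of linking the two-file reproducer.
# try_link and the variant labels are hypothetical names for illustration.
workdir=$(mktemp -d)

cat > "$workdir/main.c" <<'EOF'
void hello();

int main(int argc, char **argv) {
  hello();
  return 0;
}
EOF

cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

void hello() {
  printf("Hello World\n");
}
EOF

# try_link LABEL [CLANG FLAGS...]: build the reproducer and report the result.
try_link() {
  label=$1
  shift
  if clang "$@" -o "$workdir/main-$label" "$workdir/main.c" "$workdir/hello.c" \
      >"$workdir/$label.log" 2>&1; then
    echo "$label: link succeeded"
  else
    echo "$label: link FAILED (see $workdir/$label.log)"
  fi
}

if command -v clang >/dev/null 2>&1; then
  try_link lto-default -flto            # the failing case shown above
  try_link lto-lld -flto -fuse-ld=lld   # workaround 1 from comment 2
  try_link no-lto                       # workaround 2 from comment 2 (no LTO)
else
  echo "clang not found; skipping link attempts"
fi
```

In an affected armv7hl chroot, only the lto-default variant would be expected to fail. The per-variant log files keep the linker output for later comparison.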

Comment 15 Nick Clifton 2021-05-21 14:37:43 UTC
Created attachment 1785606 [details]
Proposed patch

Hi Tom,

  Please could you try this patch.  It has been made against the latest GNU Binutils sources, so it may not apply if you are using some other version, but if you let me know which version that is I can generate a new patch.

  The patch will probably not solve the problem, but it should prevent the illegal memory access, and it might even generate a helpful error message.  I still think that the LTO plugin is to blame, but I am not sure where a full fix should be applied.

Cheers
  Nick

Comment 16 Robert-André Mauchin 🐧 2021-08-18 18:01:33 UTC
Any update on this pretty plz?

Comment 17 Tom Stellard 2021-09-11 06:24:39 UTC
@nickc The Fedora arm test machines are back online now: https://fedoraproject.org/wiki/Test_Machine_Resources_For_Package_Maintainers

Comment 18 Tom Stellard 2021-09-11 06:34:38 UTC
Also, I just did a build of binutils-2.37 on the test machine and the segfault is gone, but I can still reproduce it in the emulated mock chroot.  I wonder if maybe one of the compile flags we use in Fedora is causing some kind of mis-compile.

Comment 19 Nick Clifton 2021-09-13 15:00:32 UTC
(In reply to Tom Stellard from comment #18)
Hi Tom,

> Also, I just did a build of binutils-2.37 on the test machine and the
> segfault is gone, but I can still reproduce it in the emulated mock chroot. 

Was that a Fedora 34 mock chroot ?  F34 binutils is based on 2.35.2, so possibly this is a bug that has been fixed between 2.35.2 and 2.37.

> I wonder if maybe one of the compile flags we use in Fedora is causing some
> kind of mis-compile.

If you configured the 2.37 binutils without the "--enable-plugins" option then this might explain why it worked.

Cheers
  Nick

Comment 20 Tom Stellard 2021-09-13 17:12:09 UTC
(In reply to Nick Clifton from comment #19)
> (In reply to Tom Stellard from comment #18)
> Hi Tom,
> 
> > Also, I just did a build of binutils-2.37 on the test machine and the
> > segfault is gone, but I can still reproduce it in the emulated mock chroot. 
> 
> Was that a Fedora 34 mock chroot ?  F34 binutils is based on 2.35.2, so
> possibly this is a bug that has been fixed between 2.35.2 and 2.37.
> 

It was a rawhide mock chroot.

> > I wonder if maybe one of the compile flags we use in Fedora is causing some
> > kind of mis-compile.
> 
> If you configured the 2.37 binutils without the "--enable-plugins" option
> then this might explain why it worked.
> 

I just checked and I did configure with --enable-plugins.

> Cheers
>   Nick

Comment 21 Tom Stellard 2021-09-14 04:35:52 UTC
On my local builds of binutils, if I drop the -flto=auto flag when building binutils, then the problem goes away.  I'm going to do a scratch build of binutils to confirm that this works in the RPM build too.

Comment 22 Nick Clifton 2021-09-14 11:00:58 UTC
(In reply to Tom Stellard from comment #21)
> On my local builds of binutils, if I drop the -flto=auto flag when building
> binutils, then the problem goes away.  

This is worrying as I thought that building with LTO enabled was getting to be reliable these days.

If you can confirm the workaround however then I will update the binutils.spec file to disable LTO when building the binutils for the ARM architecture.

Comment 23 Tom Stellard 2021-09-14 23:09:35 UTC
I can confirm that this pull request fixes the issue: https://src.fedoraproject.org/rpms/binutils/pull-request/29

Comment 24 Nick Clifton 2021-09-15 13:34:43 UTC
In which case how about we CLOSE this BZ ? :-)

Comment 25 Tom Stellard 2021-09-16 12:39:52 UTC
(In reply to Nick Clifton from comment #24)
> In which case how about we CLOSE this BZ ? :-)

Sure, unless you want to link it to a Bodhi update.

Comment 26 Fedora Update System 2021-09-17 07:57:07 UTC
FEDORA-2021-566b3f3e1b has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-566b3f3e1b

Comment 27 Fedora Update System 2021-09-17 07:58:28 UTC
FEDORA-2021-74394f7f1a has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2021-74394f7f1a

Comment 28 Nick Clifton 2021-09-17 08:00:27 UTC
Fixed in binutils-2.35.2-6.fc34
Fixed in binutils-2.37-10.fc35
Fixed in binutils-2.37-12.fc36
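
To check whether a given machine already has a fixed binutils, the installed version-release can be compared against the values listed above. The following is a rough sketch, assuming an RPM-based system with GNU sort available. at_least is a hypothetical helper, and sort -V only approximates RPM's own version comparison, which should be adequate for the simple version strings here.

```shell
#!/bin/sh
# Sketch: compare the installed binutils against a "Fixed in" version.
# at_least is a hypothetical helper; sort -V approximates RPM ordering.

# at_least INSTALLED FIXED: succeed when INSTALLED sorts at or after FIXED.
at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed=2.35.2-6.fc34   # Fedora 34 value from this comment; adjust per release

if command -v rpm >/dev/null 2>&1 &&
   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' binutils 2>/dev/null); then
  # Keep only the first line in case several builds are installed.
  installed=$(printf '%s\n' "$installed" | head -n 1)
  if at_least "$installed" "$fixed"; then
    echo "binutils $installed is at or after $fixed"
  else
    echo "binutils $installed is older than $fixed"
  fi
else
  echo "could not query binutils from the RPM database"
fi
```

The hard-coded value applies to Fedora 34 only. On Fedora 35 or 36, the corresponding version from the list above would be substituted.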

Comment 29 Fedora Update System 2021-09-17 15:19:52 UTC
FEDORA-2021-566b3f3e1b has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-566b3f3e1b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-566b3f3e1b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 30 Fedora Update System 2021-09-17 19:41:25 UTC
FEDORA-2021-74394f7f1a has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-74394f7f1a`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-74394f7f1a

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 31 Fedora Update System 2021-09-29 00:16:44 UTC
FEDORA-2021-74394f7f1a has been pushed to the Fedora 35 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 32 Fedora Update System 2021-10-02 01:27:14 UTC
FEDORA-2021-566b3f3e1b has been pushed to the Fedora 34 stable repository.
If the problem still persists, please make note of it in this bug report.

