Bug 1300543

Summary: epel7 sipp-3.4.1 FTBFS on aarch64
Product: Red Hat Enterprise Linux 7 Reporter: Yaakov Selkowitz <yselkowi>
Component: binutils Assignee: Nick Clifton <nickc>
Status: CLOSED ERRATA QA Contact: Miloš Prchlík <mprchlik>
Severity: medium Docs Contact: Tomas Capek <tcapek>
Priority: medium    
Version: 7.2 CC: dmarlin, law, mcermak, mprchlik, nickc, ohudlick
Target Milestone: rc   
Target Release: ---   
Hardware: aarch64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Better error message for *AArch64*

For the *AArch64* target, if a program declared a global variable with a type smaller than an integer but referred to it in another file as if it were an integer, the linker could generate a confusing error message. This update fixes the error message so that it clearly identifies the cause and suggests a possible reason for the error.
Story Points: ---
Clone Of:
: 1306382 Environment:
Last Closed: 2016-11-04 01:54:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1277314, 1285484, 1306382    
Attachments:
Description: Better warning message about overflown relocs
Flags: none

Description Yaakov Selkowitz 2016-01-21 06:17:45 UTC
Attempting to build sipp-3.4.1-1 from epel7 on RHELSA failed:

g++   -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches    -Wl,-z,relro  -o sipp src/sipp-actions.o src/sipp-auth.o src/sipp-comp.o src/sipp-call.o src/sipp-deadcall.o src/sipp-infile.o src/sipp-listener.o src/sipp-logger.o src/sipp-md5.o src/sipp-message.o src/sipp-milenage.o src/sipp-call_generation_task.o src/sipp-reporttask.o src/sipp-rijndael.o src/sipp-scenario.o src/sipp-sip_parser.o src/sipp-screen.o src/sipp-socket.o src/sipp-socketowner.o src/sipp-stat.o src/sipp-strings.o src/sipp-task.o src/sipp-time.o src/sipp-variables.o src/sipp-watchdog.o src/sipp-xp_parser.o src/sipp-sslinit.o src/sipp-sslthreadsafe.o  src/sipp-prepare_pcap.o src/sipp-send_packets.o src/sipp-rtpstream.o src/sipp-sipp.o  -lpcap -lsctp -lcrypto -lssl -lm -lpthread -ldl -lcurses
src/sipp-send_packets.o: In function `send_packets':
/builddir/build/BUILD/sipp-3.4.1/src/send_packets.c:147:(.text+0x370): relocation truncated to fit: R_AARCH64_LDST32_ABS_LO12_NC against symbol `media_ip_is_ipv6' defined in .bss section in src/sipp-sipp.o
/builddir/build/BUILD/sipp-3.4.1/src/send_packets.c:182:(.text+0x3f0): relocation truncated to fit: R_AARCH64_LDST32_ABS_LO12_NC against symbol `media_ip_is_ipv6' defined in .bss section in src/sipp-sipp.o
/builddir/build/BUILD/sipp-3.4.1/src/send_packets.c:224:(.text+0x4d8): relocation truncated to fit: R_AARCH64_LDST32_ABS_LO12_NC against symbol `media_ip_is_ipv6' defined in .bss section in src/sipp-sipp.o
/builddir/build/BUILD/sipp-3.4.1/src/send_packets.c:198:(.text+0x53c): relocation truncated to fit: R_AARCH64_LDST32_ABS_LO12_NC against symbol `media_ip_is_ipv6' defined in .bss section in src/sipp-sipp.o
collect2: error: ld returned 1 exit status
make: *** [sipp] Error 1 

Google seems to imply that this comes from a mismatch of types, which there indeed is:

include/sipp.hpp:extern bool               media_ip_is_ipv6;
src/send_packets.c:extern int media_ip_is_ipv6;

Nonetheless, the same version successfully built on Fedora and EPEL x86_64 as well as Fedora aarch64:

http://arm.koji.fedoraproject.org/koji/packageinfo?packageID=8720

Therefore this would appear to be an issue with binutils.

Comment 1 Nick Clifton 2016-01-29 11:32:37 UTC
The new 7.3 binutils is based upon the F23 binutils, so changing this BZ to MODIFIED so that QE can test/verify.

Comment 2 D. Marlin 2016-02-05 02:06:29 UTC
I tried building sipp using binutils-2.25.1-1.el7, but I get the same errors as before.  Do I need to try a different (or later) version?

Comment 3 Nick Clifton 2016-02-08 13:14:40 UTC
Hi Guys,

  I am reverting this BZ to assigned, since upgrading the binutils sources does not fix the problem.

  I have started an email thread about the problem on the binutils mailing list:

https://www.sourceware.org/ml/binutils/2016-02/msg00119.html

  Essentially it boils down to a test case like this:

file1.c:  extern int foo; int a (void) { return foo; }
file2.c:  char bar, foo, baz;

  When the function a() in file1.c tries to access the variable foo, it uses a 32-bit wide, 32-bit aligned access.  This fails because foo is actually 8 bits wide and 8-bit aligned, and, as it happens, not placed on a 32-bit aligned boundary.

  So the linker is correct in refusing to link the program, but it certainly could produce a more helpful error message.  I have created a patch to do this, which can be found in the email thread above.  If it proves to be acceptable I will backport it to the RHEL sources.

  The reason that this problem arises on the AArch64 architecture and not on other architectures is that the AArch64 has stricter alignment requirements on its load and store instructions.  In particular a 32-bit access will not encode the bottom two bits of the address to be accessed, so the address *must* be on a 32-bit boundary.  Other architectures may end up with slower accesses, or a run-time failure, but not a link time failure.
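
A minimal sketch of the encoding constraint described above, assuming a simplified model of the low-12-bit load/store relocations: the offset within the page is stored divided by the access size, so an offset that is not a multiple of that size has no valid encoding. The function name encode_lo12 and the addresses used with it are hypothetical, and this is not the linker's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical model of an AArch64 "LO12" load/store
 * relocation.  The low 12 bits of the symbol address are stored in the
 * instruction divided by the access size, so the bottom bits of the
 * address are never encoded.
 *
 * Returns 0 and stores the scaled immediate in *imm on success, or -1
 * if the address is not a multiple of the access size. */
int encode_lo12(uint64_t addr, unsigned access_bytes, uint32_t *imm)
{
    assert(access_bytes == 1 || access_bytes == 2 ||
           access_bytes == 4 || access_bytes == 8);

    uint32_t lo12 = (uint32_t)(addr & 0xfff);   /* offset within the 4K page */

    if (lo12 % access_bytes != 0)
        return -1;                              /* low bits would be lost */

    *imm = lo12 / access_bytes;                 /* value placed in the instruction */
    return 0;
}
```

With this model, a 4-byte access to a hypothetical address such as 0x411001 is rejected, while a 1-byte access to the same address, or a 4-byte access to 0x411004, can be encoded.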

Comment 4 D. Marlin 2016-02-08 23:10:48 UTC
Since the same version of sipp builds for F23, 

  http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=327954

I tried another build in RHELSA using the F23 version of binutils (binaries straight from koji, without rebuilding it for RHELSA):

  http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=318616

but I still get the same error as before, so I'm trying to determine why it successfully builds for AArch64 on F23, and fails on RHELSA-7.2.

I can reproduce this on a local build system: the build succeeds in an F23 mock chroot and fails in a RHELSA-7.2 mock chroot, even when using the F23 binutils.

Could the issue be that some of the options used (flags for the compiler or linker) differ between F23 and RHELSA, or possibly a difference in the code the compiler generates for the linker?

Comment 5 Nick Clifton 2016-02-09 10:06:07 UTC
> Could the issue be that some of the options used (flags for the compiler or
> linker) differ between F23 and RHELSA, or possibly a difference in the code
> the compiler generates for the linker?

Yes.  Another possibility is that the *placement* of the variables in the .data or .bss sections differs between the two installations.  This matters because the linker only detects a problem if the code tries to make an unaligned access to the variable.  In the code example in comment #3, if you remove the declarations of the bar and baz variables, the code will compile and link without any problems because foo, even though it only needs byte alignment, will just happen to be placed on a word-aligned boundary.  (The compiled code will still be incorrect, however: function a() will still load foo as if it were a 32-bit quantity rather than an 8-bit quantity.)
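
As a rough illustration of both points, the sketch below models placement and the over-wide load. It assumes a hypothetical layout in which objects are placed back to back in declaration order at their natural alignment; real compilers and linkers are free to order and pad them differently, which is exactly why the failure can appear on one installation and not on another. The helper names offset_of and load_as_int are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout model: objects are placed back to back in
 * declaration order, each rounded up to a multiple of its own size
 * (sizes must be non-zero).  Returns the offset of object `index`. */
size_t offset_of(const size_t *sizes, size_t count, size_t index)
{
    assert(index < count);

    size_t off = 0;
    for (size_t i = 0; i <= index; i++) {
        off = (off + sizes[i] - 1) / sizes[i] * sizes[i];   /* align up */
        if (i < index)
            off += sizes[i];                                /* skip past object i */
    }
    return off;
}

/* Reads four bytes starting at p, the way the code generated for
 * "extern int foo" would, regardless of how wide foo really is.
 * memcpy keeps the sketch itself free of unaligned accesses. */
int32_t load_as_int(const unsigned char *p)
{
    int32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Under this model, foo in "char bar, foo, baz;" lands at offset 1, which is not a multiple of 4, whereas a lone "char foo;" lands at offset 0 and the link goes through. Even in the aligned case the load is still wrong: load_as_int reads foo together with three neighbouring bytes, so the result equals foo only if those bytes happen to be zero and the target is little-endian.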

Comment 6 Nick Clifton 2016-02-09 11:16:13 UTC
Created attachment 1122382
Better warning message about overflown relocs

This is the patch that I plan to check in to the RHEL 7.3 binutils, once they are
open for new fixes.

Comment 8 Miloš Prchlík 2016-06-27 12:06:59 UTC
Verified for build binutils-2.25.1-20.base.el7 - ld refuses to link the binary - which is the expected behavior, as per #c3 - but provides a more descriptive message.

Comment 12 errata-xmlrpc 2016-11-04 01:54:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2265.html