Bug 1126199
| Summary: | qemu is mis-linked on aarch64 when PIE+RELRO+combreloc | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Richard W.M. Jones <rjones> |
| Component: | binutils | Assignee: | Kyle McMartin <kmcmartin> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rawhide | CC: | agk, amit.shah, berrange, cfergeau, dwmw2, itamar, jakub, kmcmartin, nickc, pbonzini, pbrobinson, peterm, rjones, scottt.tw, virt-maint |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | aarch64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | binutils-2.24-22.fc22 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-08-22 02:38:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 922257 | | |
| Attachments: | | | |

Description
Richard W.M. Jones
2014-08-03 13:16:50 UTC
Created attachment 923631
cpus.o-no-opt.txt
Compiled code with no optimization (working).
Created attachment 923632
cpus.o-opt.txt
Compiled code with optimizations and PIE (not working).
I added this patch to qemu in Rawhide to temporarily work around the issue while we try to work out what's going on:

http://pkgs.fedoraproject.org/cgit/qemu.git/commit/?id=a6c45000fe26a552c7f72ba90e5ebfb9d27ffb90

Kyle McMartin asked me to try -mtls-dialect=trad. However, it crashes in the same place. Note that I'm only guessing that it's to do with TLS. It could be something completely different.

The reproducer for this is as follows:

Check out qemu from git.

    ./configure \
        --target-list="aarch64-softmmu" \
        --extra-cflags="-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches" \
        --extra-ldflags="-Wl,-z,relro -Wl,-z,now" \
        --enable-kvm
    make

You will need a kernel (any kernel) which is uncompressed, so do something like:

    zcat /boot/vmlinuz-3.WHATEVER.fc22.aarch64 > /tmp/vmlinux

Then try to boot the kernel in qemu:

    gdb --args ./aarch64-softmmu/qemu-system-aarch64 -nodefaults -machine virt,accel=kvm -kernel /tmp/vmlinux -monitor none -serial stdio

and gdb will catch the segfault. Note that I am using an aarch64 host running Fedora Rawhide.

http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2568981
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2568986

Can you try both these builds and let me know which work? I think I've narrowed the problem down, but it's a bit nasty.

The bz1126199jkkm1 package:

    error: kvm run failed Bad address

This error message causes abort() to be called so the process segfaults:

    (gdb) bt
    #0 0x0000007fb549d098 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
    #1 0x0000007fb549ee0c in __GI_abort () at abort.c:89
    #2 0x0000005589d51f18 in kvm_cpu_exec (cpu=cpu@entry=0x558a7d1f60) at /usr/src/debug/qemu-2.1.0/kvm-all.c:1727
    #3 0x0000005589d40dcc in qemu_kvm_cpu_thread_fn (arg=0x558a7d1f60) at /usr/src/debug/qemu-2.1.0/cpus.c:874
    #4 0x0000007fb7d4604c in start_thread (arg=0x7fb2f38550) at pthread_create.c:312
    #5 0x0000007fb554b590 in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:89
    (gdb) frame 2
    #2 0x0000005589d51f18 in kvm_cpu_exec (cpu=cpu@entry=0x558a7d1f60) at /usr/src/debug/qemu-2.1.0/kvm-all.c:1727
    1727 abort();
    (gdb) frame 3
    #3 0x0000005589d40dcc in qemu_kvm_cpu_thread_fn (arg=0x558a7d1f60) at /usr/src/debug/qemu-2.1.0/cpus.c:874
    874 r = kvm_cpu_exec(cpu);
    (gdb) print cpu
    $1 = (CPUState *) 0x558a7d1f60
    (gdb) print *cpu
    $2 = {
      parent_obj = {
        parent_obj = {
          class = 0x558a7d1d90,
          free = 0x7fb7c1f564 <g_free>,
          properties = {
            tqh_first = 0x558a7c1960,
            tqh_last = 0x558a7e0038
          },
          ref = 2,
          parent = 0x558a7e6990
        },
        id = 0x0,
        realized = true,
        pending_deleted_event = false,
        opts = 0x0,
        hotplugged = 0,
        parent_bus = 0x0,
        gpios = {
          lh_first = 0x558a7e0440
        },
        child_bus = {
          lh_first = 0x0
        },
        num_child_bus = 0,
        instance_id_alias = -1,
        alias_required_for_version = 0
      },
      nr_cores = 1,
      nr_threads = 1,
      numa_node = 0,
      thread = 0x558a7ecb60,
      thread_id = 3556,
      host_tid = 0,
      running = false,
      halt_cond = 0x558a7ecb80,
      queued_work_first = 0x0,
      queued_work_last = 0x0,
      thread_kicked = false,
      created = true,
      stop = false,
      stopped = false,
      exit_request = 0,
      interrupt_request = 0,
      singlestep_enabled = 0,
      icount_extra = 0,
      jmp_env = {{
          __jmpbuf = {0 <repeats 22 times>},
          __mask_was_saved = 0,
          __saved_mask = {
            __val = {0 <repeats 16 times>}
          }
        }},
      as = 0x558a291b28 <address_space_memory>,
      tcg_as_listener = 0x0,
      env_ptr = 0x558a7da218,
      current_tb = 0x0,
      tb_jmp_cache = {0x0 <repeats 4096 times>},
      gdb_regs = 0x558a7ecb30,
      gdb_num_regs = 68,
      gdb_num_g_regs = 34,
      node = {
        tqe_next = 0x0,
        tqe_prev = 0x558a2231f0 <cpus>
      },
      breakpoints = {
        tqh_first = 0x0,
        tqh_last = 0x558a7da1a8
      },
      watchpoints = {
        tqh_first = 0x0,
        tqh_last = 0x558a7da1b8
      },
      watchpoint_hit = 0x0,
      opaque = 0x0,
      mem_io_pc = 0,
      mem_io_vaddr = 0,
      kvm_fd = 10,
      kvm_vcpu_dirty = false,
      kvm_state = 0x558a7bfba0,
      kvm_run = 0x7fb7fd9000,
      cpu_index = 0,
      halted = 0,
      icount_decr = {
        u32 = 0,
        u16 = {
          low = 0,
          high = 0
        }
      },
      can_do_io = 0,
      exception_index = 0,
      tcg_exit_req = 0
    }

OK let's ignore the previous comment. I checked back with unoptimized qemu from git and that is now failing in the same way as above on this machine.

This time with a working kernel. The bz1126199jkkm1 package works. The bz1126199jkkm2 package works.

Spiffy, this is going to be fun to debug... Thanks Richard, just wanted to double check that you were seeing the same results, since the issue is weird. :)

OK, it appears to be fixed with upstream binutils... I'll work on identifying a fix. A workaround for the moment is to set -Wl,-z,nocombreloc to avoid sorting .rela, which seems to result in the right GOT entries for the TLS vars.

regards, Kyle

The fix is:

    commit f44a1f8e513b37bcc52ba9ea0c172c3e94852756
    Author: Christophe Lyon <christophe.lyon>
    Date: Tue Jan 14 15:53:50 2014 +0100

        2014-01-14 Michael Hudson-Doyle <michael.hudson>
                   Kugan Vivekanandarajah <kugan.vivekanandarajah>

        bfd/
        * elfnn-aarch64.c (elfNN_aarch64_final_link_relocate): Use correct
        offset while calculating relocation address.
        (elfNN_aarch64_create_small_pltn_entry): Likewise.
        (elfNN_aarch64_init_small_plt0_entry): Likewise.

I'll commit it to binutils after I do a bit more testing.

Test results look good, pushed.

Thanks Kyle! I have verified that a self-compiled binutils -22 fixes the problem for me.
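
For anyone stuck on an unfixed binutils, the -Wl,-z,nocombreloc workaround mentioned in this bug could probably be combined with the reproducer's linker flags along the following lines. This is only a sketch: the variable names EXTRA_LDFLAGS and CONFIGURE_CMD are made up for illustration, the --extra-cflags from the reproducer are left out for brevity, and the command is printed for review instead of being run.

```shell
# Linker flags from the reproducer, plus the workaround flag from this bug.
# EXTRA_LDFLAGS and CONFIGURE_CMD are illustrative names, not qemu variables.
EXTRA_LDFLAGS="-Wl,-z,relro -Wl,-z,now -Wl,-z,nocombreloc"

# Assemble the configure invocation (the reproducer's --extra-cflags are omitted here).
CONFIGURE_CMD="./configure --target-list=aarch64-softmmu --extra-ldflags=\"$EXTRA_LDFLAGS\" --enable-kvm"

# Print the command so it can be checked before running it in a qemu checkout.
printf '%s\n' "$CONFIGURE_CMD"
```

Once a binutils containing the fix (binutils-2.24-22.fc22 according to this bug) is installed, the extra flag should no longer be needed.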