Description of problem:
Explicitly running ld-linux-x86-64.so.2 on an ELF executable file that lacks PT_DYNAMIC causes a SIGSEGV in ld-linux.

Version-Release number of selected component (if applicable):
glibc-2.24-3.fc25.x86_64

How reproducible:
every time

Steps to Reproduce:
1. Create an ELF executable file that lacks PT_DYNAMIC, such as by running 'upx' on a copy of /bin/date.
2. /lib64/ld-linux-x86-64.so.2 date.upx
3.

Actual results:
Program received signal SIGSEGV, Segmentation fault.
_dl_relocate_object (scope=0x55555577b488, reloc_mode=<optimized out>, consider_profiling=consider_profiling@entry=0) at dl-reloc.c:232
232       const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
(gdb) bt
#0  _dl_relocate_object (scope=0x55555577b488, reloc_mode=<optimized out>, consider_profiling=consider_profiling@entry=0) at dl-reloc.c:232
#1  0x0000555555558051 in dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:2066
#2  0x000055555556d91f in _dl_sysdep_start (start_argptr=start_argptr@entry=0x7fffffffdfa0, dl_main=dl_main@entry=0x5555555559a0 <dl_main>) at ../elf/dl-sysdep.c:249
#3  0x0000555555558f68 in _dl_start_final (arg=0x7fffffffdfa0) at rtld.c:305
#4  _dl_start (arg=0x7fffffffdfa0) at rtld.c:411
#5  0x0000555555554cd8 in _start ()
(gdb) x/i $pc
=> 0x5555555604a1 <_dl_relocate_object+161>: mov 0x8(%rax),%rax
(gdb) l
227
228     {
229       /* Do the actual relocation of the object's GOT and other data. */
230
231       /* String table object symbols. */
=>232     const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
233
234       /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code. */
235     #define RESOLVE_MAP(ref, version, r_type) \
236         ((ELFW(ST_BIND) ((*ref)->st_info) != STB_LOCAL \

Expected results:
no SIGSEGV from ld-linux.

Additional info:
Here is a patch which fixes this specific case. In general this patch is not sufficient because individual elements such as DT_STRTAB could be omitted from the PT_DYNAMIC, but the patch does handle the case when there is no PT_DYNAMIC at all.

--- rtld.c 2016-08-18 05:12:39.000000000 -0700
+++ /tmp/rtld.c 2016-10-03 09:15:09.989481696 -0700
@@ -2062,7 +2062,7 @@
       /* Also allocated with the fake malloc(). */
       l->l_free_initfini = 0;

-      if (l != &GL(dl_rtld_map))
+      if (l != &GL(dl_rtld_map) && l->l_ld)
         _dl_relocate_object (l, l->l_scope, GLRO(dl_lazy) ? RTLD_LAZY : 0,
                              consider_profiling);
=====

For the general case it is necessary to guard every instance of D_PTR with something such as

--- sysdeps/generic/ldsodefs.h 2016-08-18 03:00:11.000000000 -0700
+++ /tmp/ldsodefs.h 2016-10-03 09:23:51.737284490 -0700
@@ -58,9 +58,9 @@
    most architectures the entry is already relocated - but for some not
    and we need to relocate at access time. */
 #ifdef DL_RO_DYN_SECTION
-# define D_PTR(map, i) ((map)->i->d_un.d_ptr + (map)->l_addr)
+# define D_PTR(map, i) (!(map)->i ? NULL : ((map)->i->d_un.d_ptr + (map)->l_addr))
 #else
-# define D_PTR(map, i) (map)->i->d_un.d_ptr
+# define D_PTR(map, i) (!(map)->i ? NULL : ((map)->i->d_un.d_ptr))
 #endif

 /* Result of the lookup functions and how to retrieve the base address. */
=====
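To make the two guards concrete outside of glibc, here is a minimal standalone sketch. Everything in it is a hypothetical stand-in: `struct mock_dyn`, `struct mock_map`, `mock_d_ptr` and `mock_needs_relocation` are illustration names, not glibc identifiers, and the table size is arbitrary. It only shows the shape of the checks (skip relocation when there is no dynamic section at all, and return 0 instead of dereferencing a missing entry); it is not the actual loader code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ElfW(Dyn) and struct link_map (assumptions). */
#define MOCK_DT_STRTAB 5   /* same numeric value as DT_STRTAB in <elf.h> */
#define MOCK_DT_NUM    8   /* arbitrary table size for this sketch */

struct mock_dyn
{
  int64_t d_tag;
  union { uint64_t d_val; uint64_t d_ptr; } d_un;
};

struct mock_map
{
  uint64_t l_addr;                        /* load bias */
  struct mock_dyn *l_ld;                  /* dynamic section, NULL if absent */
  struct mock_dyn *l_info[MOCK_DT_NUM];   /* per-tag entries, NULL if absent */
};

/* Guarded lookup: yield 0 rather than dereferencing a missing entry.
   This mirrors the idea of the proposed D_PTR change. */
static uint64_t
mock_d_ptr (const struct mock_map *map, size_t tag)
{
  assert (tag < MOCK_DT_NUM);
  const struct mock_dyn *entry = map->l_info[tag];
  return entry != NULL ? entry->d_un.d_ptr : 0;
}

/* Mirrors the idea of the rtld.c change: an object with no dynamic
   section has nothing to relocate. */
static int
mock_needs_relocation (const struct mock_map *map)
{
  return map->l_ld != NULL;
}
```

A caller would test `mock_needs_relocation (map)` before doing any relocation work, and would treat a zero result from `mock_d_ptr` as "entry absent".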
Why is anything needed? There are lots of ways to construct invalid ELF objects; the dynamic linker IMO doesn't need to waste time avoiding crashes on garbage. You are trying to run it, so there is always a possibility of crashing it, e.g. just by doing something in the executable that will crash.
Such an ELF executable is valid; it just happens to need no relocation or other assistance from ld-linux, so in particular there is no PT_DYNAMIC. There is no problem running the executable directly using execve(). ld-linux should not cause SIGSEGV when running it.
(In reply to John Reiser from comment #2)
> Such a ELF executable is valid; it just happens to need no relocation or
> other assistance from ld-linux, so in particular there is no PT_DYNAMIC.
> There is no problem running the executable directly using execve().
> ld-linux should not cause SIGSEGV when running it.

ELF validity is a question that can only be answered given a specific use case. Your use case appears to be upx so let us go with that.

In Fedora Rawhide upx appears broken. In fact the test case of upx should never involve the dynamic loader because without a PT_INTERP the kernel runs the executable directly.

rpm -qa | grep upx
upx-3.91-7.fc24.x86_64

cp /bin/date .
upx ./date
eu-readelf -a -W ./date
eu-readelf: failed reading './date': (null)

Using binutils readelf (tolerant of garbage) we see:

Section Headers:
  [Nr] Name Type Address Off Size ES Flg Lk Inf Al
  [ 0] <no-name> NULL 000120d0000120d0 b900000238 ff21fbf600000002 19901740760ecb6f 1179403647 65794 1094902530871001091
  [ 1] <no-name> LOUSER+0x5d602c09 f88f21764b072705 6ed9b23c00670801 1c070f02380304 e8ab7ecdb2133720 WAXMSIOTxxxoxxxxxxxxx 2244480706 1281 2249689869432521
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Total garbage. It contains a real PT_DYNAMIC, but the binary has been completely mangled by upx without correctly updating the ELF information, and so it crashes.

In Fedora 23 it works just fine:

cp /bin/date ./date
upx ./date
./date
Fri Oct 7 02:25:30 EDT 2016

Copying the resulting F23 upx built binary to Rawhide:

./date
Thu Oct 6 23:26:11 PDT 2016

It also works fine, showing there is no problem. And the dynamic loader is smart enough _not_ to load it (combination of ET_DYN without PT_DYNAMIC):

/lib64/ld-linux-x86-64.so.2 ./date
./date: error while loading shared libraries: ./date: object file has no dynamic section

So for now this looks like a upx bug. Do you have any other reproducers you want to talk about?
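Both questions raised in this comment (is there a PT_INTERP, and is there a PT_DYNAMIC) come down to scanning the program header table. A minimal sketch of that scan, assuming the 64-bit types from the Linux `<elf.h>` header; `find_phdr`, `has_interp` and `has_dynamic` are hypothetical helper names, not functions from glibc or the kernel:

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the first program header of the given
   type, or NULL if the table contains none. */
static const Elf64_Phdr *
find_phdr (const Elf64_Phdr *phdr, size_t phnum, Elf64_Word type)
{
  assert (phdr != NULL || phnum == 0);
  for (size_t i = 0; i < phnum; i++)
    if (phdr[i].p_type == type)
      return &phdr[i];
  return NULL;
}

/* Without PT_INTERP the kernel runs the executable directly. */
static int
has_interp (const Elf64_Phdr *phdr, size_t phnum)
{
  return find_phdr (phdr, phnum, PT_INTERP) != NULL;
}

/* Without PT_DYNAMIC there is no dynamic section to process. */
static int
has_dynamic (const Elf64_Phdr *phdr, size_t phnum)
{
  return find_phdr (phdr, phnum, PT_DYNAMIC) != NULL;
}
```

For an already-running process the table could be obtained from the AT_PHDR and AT_PHNUM auxiliary vector entries; for a file on disk it would be read at `e_phoff` from the ELF header.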
Here is a reproducer of the same problem on i686:

===== upxtest.c
int const x[10000] = {1,2,3}; /* highly compressible, and in .text */
int main(int argc, char *argv[])
{
    return x[1];
}
=====

# dnf install glibc.i686 glibc-devel.i686 libgcc.i686 # i686 runtime support
$ gcc -m32 -o upxtest.i686 upxtest.c
$ upx upxtest.i686
$ gdb /lib/ld-linux.so.2
(gdb) run ./upxtest.i686
Starting program: /usr/lib/ld-linux.so.2 ./upxtest.i686

Program received signal SIGSEGV, Segmentation fault.
_dl_relocate_object (scope=0x5657bab8, reloc_mode=1, consider_profiling=0) at dl-reloc.c:232
232       const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
(gdb) l
227
228     {
229       /* Do the actual relocation of the object's GOT and other data. */
230
231       /* String table object symbols. */
232       const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
233
234       /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code. */
235     #define RESOLVE_MAP(ref, version, r_type) \
236         ((ELFW(ST_BIND) ((*ref)->st_info) != STB_LOCAL \
(gdb) bt
#0  _dl_relocate_object (scope=0x5657bab8, reloc_mode=1, consider_profiling=0) at dl-reloc.c:232
#1  0x565589e0 in dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:2066
#2  0x5656ddd9 in _dl_sysdep_start (start_argptr=0xffffd120, dl_main=0x565566e0 <dl_main>) at ../elf/dl-sysdep.c:249
#3  0x56559ec2 in _dl_start_final (arg=0xffffd120) at rtld.c:305
#4  _dl_start (arg=<optimized out>) at rtld.c:411
#5  0x56555ad7 in _start ()
(gdb) x/i $pc
=> 0x56560f14 <_dl_relocate_object+132>: mov 0x4(%eax),%eax
(gdb) p/x $eax
$1 = 0x0

[The corresponding case for x86_64 currently fails to reproduce _this_ problem because of a dispute over the validity of (PT_LOAD.p_offset > .st_size) when (0 == .p_memsz). upx uses (PT_LOAD[data_segment].p_memsz == 0) to set the value (brk(0) = .p_vaddr), and the problem arises for small files (.p_align > .st_size) because .p_align is 0x200000 (2MiB) on x86_64. upx-3.92 will work around the issue by reducing .p_align for .data to 0x1000, which will be OK because .p_align for .text remains at 2MiB.]
(In reply to Carlos O'Donell from comment #3. This reply provides background information, but should not derail attention to the SIGSEGV in ld-linux.)
> eu-readelf -a -W ./date
> eu-readelf: failed reading './date': (null)
>
> Using binutils readelf (tolerant of garbage) we see:
>
> Section Headers:
> [Nr] Name Type Address Off Size ES
> Flg Lk Inf Al
> [ 0] <no-name> NULL 000120d0000120d0 b900000238
> ff21fbf600000002 19901740760ecb6f 1179403647 65794 1094902530871001091
> [ 1] <no-name> LOUSER+0x5d602c09 f88f21764b072705 6ed9b23c00670801
> 1c070f02380304 e8ab7ecdb2133720 WAXMSIOTxxxoxxxxxxxxx 2244480706 1281
> 2249689869432521
> Key to Flags:
> W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> L (link order), O (extra OS processing required), G (group), T (TLS),
> C (compressed), x (unknown), o (OS specific), E (exclude),
> l (large), p (processor specific)
>
> Total garbage.

The reason for "total garbage" is that libbfd and its clients do not grok (0 == .e_shnum). An ELF file contains two views of the program: the execution view (the ElfXX_Phdr) and the linking view (the ElfXX_Shdr). The operating system pays attention to the Phdr only; the Shdr are ignored by execve().

Speaking charitably, the readelf in binutils has good intentions of providing as much information as possible: trying to interpret .e_shoff as the start of ElfXX_Shdr table even though (0==.e_shnum). However when (0==.e_shnum) then the printout of the "extra" Shdr should contain a warning.

upx sets (.e_shnum = 0) because upx aims for small file size of executables, and the Shdr are ignored by execution. upx used to set (.e_shoff = 0) also, but previous versions of libbfd had an even worse understanding of that.

Incidentally, gdb cannot be used to debug some runnable programs because gdb relies on libbfd, and libbfd balks when the linking view (Shdrs) is not to its liking, even though the operating system digests the execution view (Phdrs) just fine. Of course when (0==.e_shnum) then I don't expect symbols from the debugger, but I do expect register values, program counter, backtraces using the frame pointer, breakpoints and single stepping by instruction, etc. I'll file a bugreport about that later.
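As a rough illustration of the (0 == .e_shnum) situation discussed above, the sketch below classifies what an ELF header says about its section header table. It assumes the 64-bit types from the Linux `<elf.h>` header; `enum shdr_state` and `classify_shdr_table` are hypothetical names. One relevant detail from the generic ELF ABI: when `e_shnum` is 0 but `e_shoff` is nonzero, the ABI's extended numbering scheme places the real section count in the `sh_size` field of section header 0. Whether that rule is what leads readelf to print entries here is an assumption, not something confirmed in this report.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical classification of the linking view, for illustration. */
enum shdr_state
{
  SHDR_ABSENT,             /* e_shnum == 0 and e_shoff == 0 */
  SHDR_COUNT_IN_SECTION0,  /* e_shnum == 0 but e_shoff != 0: a reader
                              following the gABI extended numbering rule
                              looks in section header 0 for the count */
  SHDR_PRESENT             /* e_shnum != 0 */
};

static enum shdr_state
classify_shdr_table (const Elf64_Ehdr *ehdr)
{
  assert (ehdr != NULL);
  if (ehdr->e_shnum != 0)
    return SHDR_PRESENT;
  return ehdr->e_shoff != 0 ? SHDR_COUNT_IN_SECTION0 : SHDR_ABSENT;
}
```

Execution never consults these fields; only tools that read the linking view need to decide how to treat the middle case.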
This message is a reminder that Fedora 25 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '25'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 25 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.