Description of problem:
On 64-bit ARM, compiled utility programs waste disk space. For example, /bin/date has a useless 32 KiB chunk of zeroes, which is about 24% of its total disk usage of 135,632 bytes (residing in 136 KiB of filesystem blocks.)

Version-Release number of selected component (if applicable):
coreutils-8.27-16.fc27.aarch64

How reproducible:
every time

Steps to Reproduce:
1. readelf --segments /bin/date
2. od -Ax -tx4 -j 0x16dd0 /bin/date | sed 5q   # start at end_text
3. ls -l /bin/date; du /bin/date

Actual results:
$ readelf --segments /bin/date
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000016dd0 0x0000000000016dd0  R E    0x10000
  LOAD           0x000000000001ee90 0x000000000002ee90 0x000000000002ee90
                 0x0000000000001258 0x00000000000013c0  RW     0x10000

$ od -Ax -tx4 -j 0x16dd0 /bin/date | sed 10q
016dd0 00000000 00000000 00000000 00000000
*
01ee90 00003aa8 00000000 00003a60 00000000
01eea0 0002eea0 00000000 00012f28 00000000
01eeb0 00012f30 00000000 000118f0 00000000

$ ls -l /bin/date
-rwxr-xr-x. 1 root root 135632 Aug 22 06:25 /bin/date
$ du /bin/date
136     /bin/date   # there are no "holes"

The region from the end of .text to the beginning of .data, from file Offset 0x16dd0 to 0x1ee90, is all zero. That's 0x80c0 bytes. Quantizing to 4KiB pages in an ext4 root filesystem, the .text fragment ending at offset 0xdd0 can occupy the same page as the .data fragment starting at offset 0xe90. So there really are eight consecutive 4KiB filesystem pages that are all zero, and the 'du' shows that they are all there (there are no "holes" of demand-zero pages.)

Expected results:
Much less wasted space per compiled utility program.

Additional info:
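The arithmetic in the description can be checked with a short sketch. The function names `zero_gap` and `saved_bytes` are hypothetical, and the 4 KiB block size is the assumption stated in the report; the offsets are the ones shown by readelf above.

```python
# Offsets taken from the readelf output in the report.
END_TEXT = 0x16dd0      # end of the first PT_LOAD in the file
START_DATA = 0x1ee90    # file offset of the second PT_LOAD
FILE_SIZE = 135632      # from ls -l /bin/date
BLOCK = 4096            # assumed ext4 block size (4 KiB)


def zero_gap(end_text: int, start_data: int) -> int:
    """Number of padding bytes between the two segments in the file."""
    return start_data - end_text


def saved_bytes(end_text: int, start_data: int, block: int = BLOCK) -> int:
    """Bytes that could be dropped while keeping the data segment at the
    same block-relative offset (only whole blocks can be removed)."""
    gap = zero_gap(end_text, start_data)
    return gap - gap % block


if __name__ == "__main__":
    gap = zero_gap(END_TEXT, START_DATA)
    saved = saved_bytes(END_TEXT, START_DATA)
    print(hex(gap), saved // BLOCK, round(100 * saved / FILE_SIZE))
```

Under these assumptions the gap is 0x80c0 bytes, of which eight whole 4 KiB blocks are removable, matching the "about 24%" figure in the description.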
There is also the segment

  GNU_RELRO      0x000000000001ee90 0x000000000002ee90 0x000000000002ee90
                 0x0000000000001170 0x0000000000001170  R      0x1

which accounts for the virtual address space, but only [PT_]LOAD segments can affect the mapping between filesystem blocks and memory pages.
Just as GNU_RELRO tells ld-linux to remove PF_W after processing relocations, GNU_RELRO could also tell ld-linux to apply PF_W before processing relocations. Put the RELRO area at the high end of .text, and set PF_X in {GNU_RELRO}.p_flags so that ld-linux won't have to search to determine the final permissions when removing PF_W. Then space can be economized by the usual trick of mmap()ing one filesystem block twice (at the high end of the segment with the lower .p_vaddr, and at the low end of the segment with the higher .p_vaddr) with different .p_flags.

For true paranoia, there should be three PT_LOADs: one PF_X|PF_R (.text), one PF_R only (but including RELRO), one PF_W|PF_R (.data). Many ElfXX_Shdr sections that lack SHF_EXECINSTR are currently placed in .text, which has PF_X. That violates the principle of least privilege.
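The "map one filesystem block twice" trick can be illustrated with a minimal sketch. This is not what ld-linux does internally; it only shows, on a Unix system, that the same file block can back two private mappings with different protections. The helper name `map_block_twice` and the use of a temporary file are assumptions made for the illustration.

```python
import mmap
import tempfile


def map_block_twice(block: bytes):
    """Map the same file block twice: once read-only (standing in for the
    tail of the text segment) and once writable and private (standing in
    for the head of the data segment). Returns the first 8 bytes seen
    through each mapping after a write through the writable one."""
    with tempfile.TemporaryFile() as f:
        f.write(block)
        f.flush()
        ro = mmap.mmap(f.fileno(), len(block),
                       flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
        rw = mmap.mmap(f.fileno(), len(block),
                       flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            # The write triggers copy-on-write for this mapping only;
            # the file and the read-only mapping are not modified.
            rw[0:4] = b"DATA"
            return bytes(ro[:8]), bytes(rw[:8])
        finally:
            ro.close()
            rw.close()


if __name__ == "__main__":
    page = b"TEXT" * (mmap.PAGESIZE // 4)
    print(map_block_twice(page))
```

Because both mappings are MAP_PRIVATE, a store through the writable view costs one extra physical page for that process, while the read-only view keeps sharing the page cache copy.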
Those zeros definitely do not come from the coreutils source code. They seem to be produced by binutils, and I am afraid that this is intentional. Bug #1267979 describes a similar (but more significant) issue with the mkdir binary. If space consumption is your concern, you can try installing coreutils-single as a replacement for coreutils.
The hole is completely intentional, so that the end of the GNU_RELRO segment is aligned on a page boundary. Two page sizes matter: one is the typical page size, which is what the GNU_RELRO segment is aligned to, and the other, often larger than that, is the maximum page size for the given architecture.
Here is a file layout and runtime mapping strategy that satisfies the maximum page size alignment restriction yet minimizes fragmentation in the file. In particular the file fragmentation lost due to alignment is twice the size of a data cache line.

The file generated by static linker /bin/ld is laid out:

0:
    ElfXX_Ehdr and ElfXX_Phdr
    anything that requires PROT_EXEC
    optionally, anything that does not require PROT_WRITE
etext:
    .balign data_cache_line_size  // for data cache efficiency
relro:
    contents of PT_GNU_RELRO segment
erelro:
    .balign data_cache_line_size  // for data cache efficiency
data_p:  // before the .loc pseudo op
    // generate no physical bytes, but change the logical address ("location counter")
    .loc UP(MPS, data_p)  // same page-relative position on next logical page
data:  // Notice that (data - data_p) is a multiple of MPS.
    anything that requires PROT_WRITE
edata:
    anything not needed for execution of instructions (debug info, etc.)

where MPS = maximum page size. The function UP(page, addr):

    ((page - 1) & addr) ? (page + addr) : addr

moves to the same offset on the next page. The pseudo-op .loc changes the logical address without changing the physical address (no physical bytes are generated into the file.)

The ElfXX_Phdr memory mapping at run time is:

.p_type       .p_offset  .p_vaddr  .p_memsz        .p_flags   .p_align
PT_LOAD       0          0         erelro - 0      PF_X|PF_R  MPS
PT_LOAD       data_p     data      edata - data    PF_W|PF_R  MPS
PT_GNU_RELRO  relro      relro     erelro - relro  PF_X|PF_R  NBPW

The relocation processing done by the runtime loader ld-linux is:

If PT_GNU_RELRO is present {
    mprotect(.p_vaddr, .p_memsz, PROT_WRITE|PROT_READ)  // for PT_GNU_RELRO
}
Process relocations.
If PT_GNU_RELRO is present {
    mprotect(.p_vaddr, .p_memsz, PROT_from_PF(.p_flags))  // for PT_GNU_RELRO
}

The number of physical pages required at execution depends only on the number of page boundaries crossed by the PT_GNU_RELRO segment.
Of course this is minimized if one end is aligned to a page boundary, but for a large maximum page size (such as 64 KiB) and small {RELRO}.p_memsz (such as a few hundred pointers or fewer) then many placements not adjacent to a page boundary also do not cross a page boundary. As long as RELRO contents do not cross an extra page boundary, then placing RELRO inside PT_LOAD[0] requires no more physical pages at runtime (for the sum total of address spaces for any number of processes that execve() the same file) than placing RELRO inside PT_LOAD[1]. The per-process cost of RELRO in physical pages is 1 more than the number of page boundaries it crosses, independent of which PT_LOAD contains RELRO (including if RELRO has a [third] PT_LOAD all to itself.)
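The UP() function and the page-count argument can be sketched as follows. The names `up` and `pages_touched` are hypothetical; the example addresses are the GNU_RELRO values from the readelf output in the report plus one made-up placement (0x16e00) chosen only to illustrate a RELRO area that is not adjacent to a page boundary.

```python
MPS = 0x10000  # maximum page size, 64 KiB as in the Align column of the report


def up(page: int, addr: int) -> int:
    """UP(page, addr) as written in the comment: a page-aligned address is
    unchanged, any other address moves to the same offset on the next page."""
    return page + addr if (page - 1) & addr else addr


def pages_touched(vaddr: int, memsz: int, page: int) -> int:
    """Pages occupied by [vaddr, vaddr + memsz): one more than the number
    of page boundaries the range crosses. Assumes memsz > 0."""
    first = vaddr // page
    last = (vaddr + memsz - 1) // page
    return last - first + 1


if __name__ == "__main__":
    data_p = 0x16e00                    # hypothetical file position
    data = up(MPS, data_p)
    print(hex(data), (data - data_p) % MPS)

    # GNU_RELRO from the report: vaddr 0x2ee90, memsz 0x1170.
    print(pages_touched(0x2ee90, 0x1170, MPS))     # 64 KiB pages
    print(pages_touched(0x2ee90, 0x1170, 0x1000))  # 4 KiB pages
    # Hypothetical placement at the high end of the first PT_LOAD.
    print(pages_touched(0x16e00, 0x1170, MPS))
```

With 64 KiB pages both placements stay within a single page, which is the point made above: a small RELRO area does not need to end on a page boundary to avoid crossing one.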
The "go" static linker already generates three PT_LOAD (as suggested in Comment 2).

$ readelf --segments foo.go  # x86_64

Elf file type is EXEC (Executable file)
Entry point 0x2034f0
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000200040 0x0000000000200040
                 0x0000000000000150 0x0000000000000150  R      8
  LOAD           0x0000000000000000 0x0000000000200000 0x0000000000200000
                 0x000000000000230f 0x000000000000230f  R      1000
  LOAD           0x0000000000003000 0x0000000000203000 0x0000000000203000
                 0x0000000000014704 0x0000000000014704  R E    1000
  LOAD           0x0000000000018000 0x0000000000218000 0x0000000000218000
                 0x000000000001c7a8 0x000000000001d018  RW     1000
  GNU_RELRO      0x0000000000034000 0x0000000000234000 0x0000000000234000
                 0x00000000000007a8 0x0000000000001000  R      1
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0

 Section to Segment mapping:
  Segment Sections...
   00
   01     .rodata
   02     .text
   03     .data .data.rel.ro .bss
   04     .data.rel.ro
   05
Hi John,

> Here is a file layout and runtime mapping strategy that satisfies the
> maximum page size alignment restriction yet minimizes fragmentation in the
> file.

This is an enhancement request that I think would be best served by being implemented first in the FSF binutils and then brought in to Fedora. To that end would you mind filing a binutils enhancement request here:

https://sourceware.org/bugzilla/enter_bug.cgi?product=binutils

Thanks.

Cheers
Nick
This message is a reminder that Fedora 27 is nearing its end of life. On 2018-Nov-30 Fedora will stop maintaining and issuing updates for Fedora 27. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 27 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.