Bug 2059838 - kernel can't be built because of missing STT_SECTION section symbol in the symbol table for the ".text.unlikely" section
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: binutils
Version: 35
Hardware: powerpc
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Nick Clifton
QA Contact: Fedora Extras Quality Assurance
Reported: 2022-03-02 07:30 UTC by Coiby
Modified: 2022-12-13 16:48 UTC

Last Closed: 2022-12-13 16:48:23 UTC
Type: Bug


Attachments
f34, f35 and rhel9 has the same kexec_file.s (42.81 KB, text/plain)
2022-03-03 14:41 UTC, Coiby

Description Coiby 2022-03-02 07:30:20 UTC
Description of problem:

With ftrace enabled, linux/scripts/recordmcount (a binary built from recordmcount.c) needs the .text.unlikely section symbol to be present in the symbol table in order to build the __mcount_loc section when the section has no global symbol to use as a reference:
$ readelf -a kernel/kexec_file.o
Relocation section '.rela__mcount_loc' at offset 0x70e0 contains 20 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
...
000000000090  000c00000026 R_PPC64_ADDR64    0000000000000000 .text.unlikely + c
000000000098  000c00000026 R_PPC64_ADDR64    0000000000000000 .text.unlikely + 4c


Without this STT_SECTION section symbol, recordmcount would quit with the following error,
Cannot find symbol for section 11: .text.unlikely.

Thus the kernel building process can't continue.
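
As a rough illustration of the lookup described above (not the actual recordmcount code), the following Python sketch scans a raw little-endian ELF64 symbol table for the STT_SECTION entry of a given section index. The function and helper names are made up for this example; only the Elf64_Sym layout and the STT_SECTION value come from the ELF specification.

```python
import struct

STT_SECTION = 3  # ELF symbol type used for section symbols
# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size (24 bytes)
ELF64_SYM = struct.Struct("<IBBHQQ")


def section_symbol_index(symtab: bytes, shndx: int):
    """Return the index of the STT_SECTION symbol for section shndx, or None."""
    for i in range(len(symtab) // ELF64_SYM.size):
        _name, info, _other, sym_shndx, _value, _size = ELF64_SYM.unpack_from(
            symtab, i * ELF64_SYM.size
        )
        # The low four bits of st_info hold the symbol type.
        if (info & 0xF) == STT_SECTION and sym_shndx == shndx:
            return i
    return None


def make_sym(sym_type: int, shndx: int) -> bytes:
    """Hypothetical helper: pack one local Elf64_Sym entry for experiments."""
    return ELF64_SYM.pack(0, sym_type, 0, shndx, 0, 0)
```

With a symbol table like the F35 one, where no section symbol has Ndx 11, the lookup returns None. That corresponds to the situation in which recordmcount gives up with "Cannot find symbol for section 11".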


Version-Release number of selected component (if applicable):

[root@ibm-p9b-18 kernel-ark]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/ppc64le-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: ppc64le-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-targets=powerpcle-linux --disable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-11.2.1-20220127/obj-ppc64le-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-secureplt --with-long-double-128 --with-cpu-32=power8 --with-tune-32=power8 --with-cpu-64=power8 --with-tune-64=power8 --build=ppc64le-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC) 


How reproducible:

always

Steps to Reproduce:
1. use a kernel .config file from https://gitlab.com/api/v4/projects/18194050/jobs/2142072138/artifacts/artifacts/kernel-mainline.kernel.org-ppc64le-7e57714cd0ad2d5bb90e50b5096a0e671dec1ef3.config
2. make


Actual results:

The kernel fails to build.

Expected results:

The kernel is built successfully.

Additional info:

1. RHEL9 has the same gcc version (11.2.1 20220127) as F35, yet the kernel/kexec_file.o built there does have the ".text.unlikely" section symbol:
$ readelf -a kernel/kexec_file.o
...
Relocation section '.rela.text.unlikely' at offset 0x5920 contains 12 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  0026000000fc R_PPC64_REL16_HA  0000000000000000 .TOC. + 0
000000000004  0026000000fa R_PPC64_REL16_LO  0000000000000000 .TOC. + 4
00000000000c  00270000000a R_PPC64_REL24     0000000000000000 _mcount + 0
...
Symbol table '.symtab' contains 118 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS kexec_file.c
      ...
    12: 0000000000000000     0 SECTION LOCAL  DEFAULT   11 .text.unlikely


2.  For F34 x86_64, kernel/kexec_file.o also has the ".text.unlikely" section symbol.

Comment 1 Jakub Jelinek 2022-03-02 13:44:50 UTC
You didn't say which exact kernel you're building.
Perhaps best would be if you could just attach preprocessed kexec_file.i and full gcc command line used to compile that file (add -save-temps to it to get the preprocessed source).

Comment 2 Coiby 2022-03-03 00:40:31 UTC
Created attachment 1863914
kexec_file.[io] for RHEL9 and F35

(In reply to Jakub Jelinek from comment #1)
> You didn't say which exact kernel you're building.

I used commit 11bff7a9736973ab3482fc0cfdae3c4eccdea8ff ("[redhat] kernel-5.17.0-0.rc5.2293be58d6a1.107") of kernel-ark. But this issue has been there for around 4 months according to [1].

> Perhaps best would be if you could just attach preprocessed kexec_file.i and
> full gcc command line used to compile that file (add -save-temps to it to
> get the preprocessed source).

Thanks for this info! The full gcc command line is:

$ gcc -save-temps  -Wp,-MMD,kernel/.kexec_file.o.d -nostdinc -I./arch/powerpc/include -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -I ./arch/powerpc -DHAVE_AS_ATHIGH=1 -fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 -mlittle-endian -m64 -msoft-float -pipe -mtraceback=no -mabi=elfv2 -mcmodel=medium -mno-pointers-to-nested-functions -mcpu=power8 -mno-altivec -mno-vsx -fno-asynchronous-unwind-tables -mno-string -Wa,-maltivec -Wa,-mpower4 -Wa,-many -mno-strict-align -mlittle-endian -mstack-protector-guard=tls -mstack-protector-guard-reg=r13 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -O2 -fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector -Wimplicit-fallthrough=5 -Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-stack-clash-protection -pg -mprofile-kernel -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation -Wno-zero-length-bounds -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -Wno-alloc-size-larger-than -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -mstack-protector-guard-offset=2816    -DKBUILD_MODFILE='"kernel/kexec_file"' -DKBUILD_BASENAME='"kexec_file"' -DKBUILD_MODNAME='"kexec_file"' -D__KBUILD_MODNAME=kmod_kexec_file -c -o kernel/kexec_file.o kernel/kexec_file.c 

Attached are the kexec_file.[io] files for RHEL9 and F35 respectively. The kexec_file.i files are the same but the kexec_file.o files differ: the F35 one is missing the section symbol for the .text.unlikely section.

[1] https://datawarehouse.cki-project.org/issue/776

Comment 3 Jakub Jelinek 2022-03-03 09:41:47 UTC
Thanks, though I'm confused on what to look for.
Looking at the kexec_file.o files you've uploaded, I see no .rela.__mcount_loc section and the only spots that mention .text.unlikely in readelf -Wa are:
readelf -Wa */kexec_file.o | grep .text.unlikely
  [11] .text.unlikely    PROGBITS        0000000000000000 0029c0 000080 00  AX  0   0  4
  [12] .rela.text.unlikely RELA            0000000000000000 005830 000120 18   I 30  11  8
Relocation section '.rela.text.unlikely' at offset 0x5830 contains 12 entries:
  [11] .text.unlikely    PROGBITS        0000000000000000 0029c0 000080 00  AX  0   0  4
  [12] .rela.text.unlikely RELA            0000000000000000 005920 000120 18   I 30  11  8
Relocation section '.rela.text.unlikely' at offset 0x5920 contains 12 entries:
It is true that rhel9 has far more SECTION symbols:
readelf -Wa f35/kexec_file.o | grep SECTION
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
    12: 0000000000000000     0 SECTION LOCAL  DEFAULT   17 
    21: 0000000000000000     0 SECTION LOCAL  DEFAULT   18 
    22: 0000000000000000     0 SECTION LOCAL  DEFAULT   20 
readelf -Wa rhel9/kexec_file.o | grep SECTION
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT    9 
    12: 0000000000000000     0 SECTION LOCAL  DEFAULT   11 
    13: 0000000000000000     0 SECTION LOCAL  DEFAULT   13 
    14: 0000000000000000     0 SECTION LOCAL  DEFAULT   15 
    17: 0000000000000000     0 SECTION LOCAL  DEFAULT   17 
    26: 0000000000000000     0 SECTION LOCAL  DEFAULT   18 
    27: 0000000000000000     0 SECTION LOCAL  DEFAULT   20 
    28: 0000000000000000     0 SECTION LOCAL  DEFAULT   22 
    30: 0000000000000000     0 SECTION LOCAL  DEFAULT   24 
    32: 0000000000000000     0 SECTION LOCAL  DEFAULT   26 
    35: 0000000000000000     0 SECTION LOCAL  DEFAULT   29 
    37: 0000000000000000     0 SECTION LOCAL  DEFAULT   28 
Anyway, such symbols are created by the assembler, not gcc, so most likely this would be some binutils difference.
If you say fc34 works too, I think it would be nice if you could upload kexec_file.s from F34, F35 and RHEL9 (so that we could quickly rule out gcc as culprit if possible;
I think all of fc34, fc35 and rhel9 use more or less the same compiler, fc34 and fc35 the same, rhel9 applies a few patches on top of that and defaults to -mcpu=power9 rather
than -mcpu=power8, but as you use explicit -mcpu=power8, that shouldn't change much).
I think STT_SECTION symbols are only strictly required if there are any relocations against them which seems to be the case in the snippets you wrote in #c0, but doesn't seem to be the case in the attached *.o files.

Comment 4 Coiby 2022-03-03 14:41:52 UTC
Created attachment 1864007
f34, f35 and rhel9 has the same kexec_file.s

(In reply to Jakub Jelinek from comment #3)
> Thanks, though I'm confused on what to look for.

.text.unlikely has Nr=11 so we need to find the corresponding section symbol with Ndx=11 in the symbol table. For F35, we can't find a section symbol with Ndx=11.
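
The check described here can be reproduced on the readelf output quoted in comment 3. This is a minimal sketch, assuming the `readelf -Ws` row layout shown in this report; the function name is hypothetical, the F35 rows are copied from comment 3, and the RHEL9 rows are only an excerpt of that list.

```python
def section_symbol_ndx(readelf_rows):
    """Collect the Ndx column of every SECTION row in readelf symbol output."""
    found = set()
    for row in readelf_rows:
        fields = row.split()
        # Row layout: Num: Value Size Type Bind Vis Ndx [Name]
        if len(fields) >= 7 and fields[3] == "SECTION":
            found.add(int(fields[6]))
    return found


F35_ROWS = """\
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    12: 0000000000000000     0 SECTION LOCAL  DEFAULT   17
    21: 0000000000000000     0 SECTION LOCAL  DEFAULT   18
    22: 0000000000000000     0 SECTION LOCAL  DEFAULT   20
""".splitlines()

RHEL9_ROWS_EXCERPT = """\
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT    9
    12: 0000000000000000     0 SECTION LOCAL  DEFAULT   11
    13: 0000000000000000     0 SECTION LOCAL  DEFAULT   13
""".splitlines()

TEXT_UNLIKELY = 11  # section number of .text.unlikely in both objects

print(TEXT_UNLIKELY in section_symbol_ndx(F35_ROWS))            # False
print(TEXT_UNLIKELY in section_symbol_ndx(RHEL9_ROWS_EXCERPT))  # True
```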

> Looking at the kexec_file.o files you've uploaded, I see no
> .rela.__mcount_loc section and the only spots that mention .text.unlikely in

The kexec_file.o files are the direct output of gcc. They haven't been processed by linux/scripts/recordmcount, hence there is no .rela__mcount_loc section. You can build linux/scripts/recordmcount (gcc linux/scripts/recordmcount.c -o linux/scripts/recordmcount) and run "linux/scripts/recordmcount linux/kernel/kexec_file.o"; then you will see the .rela__mcount_loc section:

 Relocation section '.rela__mcount_loc' at offset 0x70e0 contains 20 entries:
   Offset          Info           Type           Sym. Value    Sym. Name + Addend
 ...
 000000000090  000c00000026 R_PPC64_ADDR64    0000000000000000 .text.unlikely + c
 000000000098  000c00000026 R_PPC64_ADDR64    0000000000000000 .text.unlikely + 4c


> readelf -Wa are:
> readelf -Wa */kexec_file.o | grep .text.unlikely
>   [11] .text.unlikely    PROGBITS        0000000000000000 0029c0 000080 00 
> AX  0   0  4
>   [12] .rela.text.unlikely RELA            0000000000000000 005830 000120 18
> I 30  11  8
> Relocation section '.rela.text.unlikely' at offset 0x5830 contains 12
> entries:
>   [11] .text.unlikely    PROGBITS        0000000000000000 0029c0 000080 00 
> AX  0   0  4
>   [12] .rela.text.unlikely RELA            0000000000000000 005920 000120 18
> I 30  11  8
> Relocation section '.rela.text.unlikely' at offset 0x5920 contains 12
> entries:
> It is true that rhel9 has far more SECTION symbols:

This could be another clue about this issue.

> readelf -Wa f35/kexec_file.o | grep SECTION
>      2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
>      3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
>      5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
>      7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
>     12: 0000000000000000     0 SECTION LOCAL  DEFAULT   17 
>     21: 0000000000000000     0 SECTION LOCAL  DEFAULT   18 
>     22: 0000000000000000     0 SECTION LOCAL  DEFAULT   20 
> readelf -Wa rhel9/kexec_file.o | grep SECTION
>      2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
>      3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
>      4: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
>      6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
>      8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
>     10: 0000000000000000     0 SECTION LOCAL  DEFAULT    9 
>     12: 0000000000000000     0 SECTION LOCAL  DEFAULT   11 
>     13: 0000000000000000     0 SECTION LOCAL  DEFAULT   13 
>     14: 0000000000000000     0 SECTION LOCAL  DEFAULT   15 
>     17: 0000000000000000     0 SECTION LOCAL  DEFAULT   17 
>     26: 0000000000000000     0 SECTION LOCAL  DEFAULT   18 
>     27: 0000000000000000     0 SECTION LOCAL  DEFAULT   20 
>     28: 0000000000000000     0 SECTION LOCAL  DEFAULT   22 
>     30: 0000000000000000     0 SECTION LOCAL  DEFAULT   24 
>     32: 0000000000000000     0 SECTION LOCAL  DEFAULT   26 
>     35: 0000000000000000     0 SECTION LOCAL  DEFAULT   29 
>     37: 0000000000000000     0 SECTION LOCAL  DEFAULT   28 
> Anyway, such symbols are created by the assembler, not gcc, so most likely
> this would be some binutils difference.
> If you say fc34 works too, I think it would be nice if you could upload
> kexec_file.s from F34, F35 and RHEL9 (so that we could quickly rule out gcc
> as culprit if possible;

For F34, F35 and RHEL9, the kexec_file.s files are the same, so I only uploaded one file.

> I think all of fc34, fc35 and rhel9 use moreless the same compiler, fc34 and
> fc35 the same, rhel9 applies a few patches on top of that and defaults to
> -mcpu=power9 rather
> than -mcpu=power8, but as you use explicit -mcpu=power8, that shouldn't
> change much).
> I think STT_SECTION symbols are only strictly required if there are any
> relocations against them which seems to be the case in the snippets you
> wrote in #c0, but doesn't seem to be the case in the attached *.o files.

All the files have .rela.text.unlikely which I would interpret as the need for relocation, 

 Relocation section '.rela.text.unlikely' at offset 0x5920 contains 12 entries:
   Offset          Info           Type           Sym. Value    Sym. Name + Addend
 000000000000  0026000000fc R_PPC64_REL16_HA  0000000000000000 .TOC. + 0 
 000000000004  0026000000fa R_PPC64_REL16_LO  0000000000000000 .TOC. + 4
 00000000000c  00270000000a R_PPC64_REL24     0000000000000000 _mcount + 0

Comment 5 Jakub Jelinek 2022-03-03 14:57:57 UTC
As the *.s file is the same, reassigning to binutils.
That said, http://www.sco.com/developers/gabi/latest/ch4.symtab.html
says:
STT_SECTION
    The symbol is associated with a section. Symbol table entries of this type exist primarily for relocation and normally have STB_LOCAL binding. 
There is e.g. STT_SECTION for section 1 (in f35/kexec_file.o it is .text) because there are relocations that refer to it, e.g.:
Relocation section '.rela__jump_table' at offset 0x5758 contains 9 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  000000020000001a R_PPC64_REL32          0000000000000000 .text + 750
0000000000000004  000000020000001a R_PPC64_REL32          0000000000000000 .text + 780

     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
has index 2 and is mentioned in rela_info of the relocations (which is why readelf renders it as .text).
There is no relocation that refers to .text.unlikely section symbol, so IMHO that section symbol doesn't need
to be present.  If recordmcount creates relocations that need that section symbol, IMHO it should create that STT_SECTION symbol if it isn't present.
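
To make the "mentioned in the relocation info" step concrete: on ELF64 the r_info field carries the symbol index in its upper 32 bits and the relocation type in its lower 32 bits. A small sketch (the helper name is an assumption for illustration), using Info values copied from the readelf output in this report:

```python
def elf64_r_info(info: int):
    """Split an ELF64 r_info value into (symbol index, relocation type)."""
    return info >> 32, info & 0xFFFFFFFF


# Info value of the .rela__jump_table entries quoted in this comment
print(elf64_r_info(0x000000020000001A))  # (2, 26)

# Info value of the .rela__mcount_loc entries quoted in the description
print(elf64_r_info(0x000C00000026))      # (12, 38)
```

The first value resolves to symbol 2, the .text section symbol. The second names symbol 12, which is the index at which the RHEL9 symbol table listing in the description shows the .text.unlikely section symbol.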

Comment 6 Nick Clifton 2022-03-04 11:25:56 UTC
Hi Coiby,

Jakub is exactly right:

> There is no relocation that refers to .text.unlikely section symbol, so IMHO
> that section symbol doesn't need to be present.  

This is precisely why the section symbol is absent.

An optimization was added to the assembler in binutils 2.36 to stop it producing
unneeded section symbols.  This saves space in object files and also helps with
compatibility with LLVM.  See:

  https://sourceware.org/bugzilla/show_bug.cgi?id=27109

Is it possible to change recordmcount to create the section symbol for itself?

In theory I could add an option to the assembler to force it to always generate
section symbols, but if recordmcount can be fixed instead then I would prefer that...

Cheers
  Nick
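
For illustration only, here is a sketch of what "creating the section symbol" could involve at the byte level. It is not taken from recordmcount or from the kernel fix mentioned elsewhere in this report. The helper names are hypothetical; only the Elf64_Sym layout and the rule that local symbols precede global ones (with the sh_info field of .symtab giving the index of the first non-local symbol) come from the ELF specification.

```python
import struct

STB_LOCAL, STT_SECTION = 0, 3
# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size
ELF64_SYM = struct.Struct("<IBBHQQ")


def make_section_symbol(shndx: int) -> bytes:
    """Pack an unnamed, local STT_SECTION symbol for section shndx."""
    st_info = (STB_LOCAL << 4) | STT_SECTION
    return ELF64_SYM.pack(0, st_info, 0, shndx, 0, 0)


def insert_local_symbol(symtab: bytes, first_global: int, entry: bytes):
    """Insert entry as the last local symbol.

    first_global is the current sh_info of .symtab. Returns the new symbol
    table bytes and the new sh_info value.
    """
    offset = first_global * ELF64_SYM.size
    return symtab[:offset] + entry + symtab[offset:], first_global + 1
```

A real tool would also have to renumber the symbol index in every relocation that referred to a symbol at or after the insertion point, and update the affected section header sizes and offsets.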

Comment 7 Nick Clifton 2022-03-07 12:34:15 UTC
FYI - I proposed a patch to implement the assembler option, but it was pointed out that this issue has already been fixed in the kernel:

https://sourceware.org/pipermail/binutils/2022-March/119937.html

Comment 8 Coiby 2022-03-08 06:23:27 UTC
(In reply to Jakub Jelinek from comment #5)
> As the *.s file is the same, reassigning to binutils.
> That said, http://www.sco.com/developers/gabi/latest/ch4.symtab.html
> says:
> STT_SECTION
>     The symbol is associated with a section. Symbol table entries of this
> type exist primarily for relocation and normally have STB_LOCAL binding. 
> There is e.g. STT_SECTION for section 1 (in f35/kexec_file.o it is .text)
> because there are relocations that refer to it, e.g.:
> Relocation section '.rela__jump_table' at offset 0x5758 contains 9 entries:
>     Offset             Info             Type               Symbol's Value 
> Symbol's Name + Addend
> 0000000000000000  000000020000001a R_PPC64_REL32          0000000000000000
> .text + 750
> 0000000000000004  000000020000001a R_PPC64_REL32          0000000000000000
> .text + 780
> 
>      2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
> has index 2 and is mentioned in rela_info of the relocations (which is why
> readelf renders it as .text).
> There is no relocation that refers to .text.unlikely section symbol, so IMHO
> that section symbol doesn't need
> to be present.  If recordmcount creates relocations that need that section
> symbol, IMHO it should create that STT_SECTION symbol if it isn't present.

Thanks for pinpointing the root cause and for the detailed explanation!

Comment 9 Coiby 2022-03-08 06:32:10 UTC
(In reply to Nick Clifton from comment #6)
> Hi Colby,

Hi Nick,

> 
> Jakub is exactly right:
> 
> > There is no relocation that refers to .text.unlikely section symbol, so IMHO
> > that section symbol doesn't need to be present.  
> 
> This is precisely why the section symbol is absent.
> 
> An optimization was added to the assembler in binutils 2.36 to stop it
> producing
> unneeded section symbols.  This saves space in object files and also helps
> with
> compatibility with LLVM.  See:
> 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=27109

Thanks for confirming the analysis of Jakub and pointing me to the change of binutils!

> 
> Is is possible to change recordmcount to create the section symbol for
> itself ?

Actually, I've already created Bug 2059842 for recordmcount.c because I noticed that the perl version (linux/scripts/recordmcount.pl) uses a global symbol instead of a section symbol. I have yet to fully understand how recordmcount.c works. I'll first try to bring this suggestion to the attention of the recordmcount.c developers.

>  
> In theory I could add an option to the assembler to force it to always
> generate
> section symbols, but if recordmcount can be fixed instead then I would
> prefer that...

> 
> Cheers
>   Nick

Comment 10 Ben Cotton 2022-11-29 17:59:00 UTC
This message is a reminder that Fedora Linux 35 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '35'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 35 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 11 Ben Cotton 2022-12-13 16:48:23 UTC
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13.

Fedora Linux 35 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

