Bug 1684303

Summary: gdb-8.2-5 has troubles with debuginfo files from corosync-3.0.0-2.el8 (x86_64)
Product: Red Hat Enterprise Linux 8
Reporter: Jan Pokorný [poki] <jpokorny>
Component: gdb
Assignee: Keith Seitz <keiths>
gdb sub component: system-version
QA Contact: Michal Kolar <mkolar>
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
CC: dsmith, gdb-bugs, hhan, jfriesse, mcermak, mjw, ohudlick
Version: 8.0
Keywords: Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: 8.2-7.el8
Doc Type: No Doc Update
Last Closed: 2020-04-28 15:42:21 UTC
Type: Bug
Bug Depends On: 1630926    
Bug Blocks: 1755139    

Description Jan Pokorný [poki] 2019-02-28 22:40:18 UTC
I'm not sure whether this is related to [bug 1656088], but there are
several things not working as expected:

1. plain corosync debugging

# rpm -qa | grep corosync
> corosynclib-3.0.0-2.el8.x86_64
> corosync-3.0.0-2.el8.x86_64
> corosync-debugsource-3.0.0-2.el8.x86_64
> corosynclib-debuginfo-3.0.0-2.el8.x86_64
> corosync-debuginfo-3.0.0-2.el8.x86_64

# gdb -args corosync
> [...]
> Reading symbols from corosync...Reading symbols from /usr/lib/debug/usr/sbin/corosync-3.0.0-2.el8.x86_64.debug...done.
> done.
> DW_FORM_strp pointing outside of .debug_str section [in module /usr/lib/debug/usr/sbin/corosync-3.0.0-2.el8.x86_64.debug]
> (gdb)


2. debugging usage of corosync libraries

# dnf install pacemaker

# gdb -args pacemakerd
> [...]
> (gdb) b cmap_initialize
>> Breakpoint 1 at 0x3730
> (gdb) run
>> Starting program: /usr/sbin/pacemakerd 
>> Missing separate debuginfos, use: dnf debuginfo-install glibc-2.28-42.el8.x86_64
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> warning: Loadable section ".note.gnu.property" outside of ELF segments
>> Segmentation fault (core dumped)

-->

> kernel: gdb[16288]: segfault at 0 ip 000056329f4b8c80 sp 00007ffc60786f40 error 4 in gdb[56329f139000+8da000]


Similar breakage is observed with Fedora:
gdb-8.2.50.20190120-15.fc30.x86_64
corosync-3.0.1-2.fc30.x86_64

Comment 1 Sergio Durigan Junior 2019-02-28 23:55:16 UTC
I was able to reproduce both failures on FSF GDB, so it's an upstream bug, not caused by our local patches.

The segmentation fault happens in dwarf2read.c:parse_macro_definition, because "body" is NULL.  I tried to investigate whether the same problem happens with older versions of GDB, but unfortunately they don't compile anymore on Rawhide.
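
As a rough illustration of the failure mode (this is not GDB's code, which is written in C), a parser for macro definition strings has to treat a missing body as corrupt input rather than dereference it. The helper name split_macro_definition and the simplified "NAME VALUE" layout are assumptions made for this sketch:

```python
from typing import Optional, Tuple


def split_macro_definition(body: Optional[str]) -> Optional[Tuple[str, str]]:
    """Split a macro string such as "__STDC__ 1" into (name, value).

    Hypothetical helper: returns None for a missing or nameless body
    instead of crashing, so the caller can skip the corrupt entry.
    """
    if body is None:
        # Corrupt debug info can yield no string at all; bail out early.
        return None
    # Everything up to the first space is the name, the rest is the value.
    name, _, value = body.partition(" ")
    if not name:
        return None
    return name, value
```

A caller would then skip (and ideally report) any entry for which the helper returns None, instead of passing the missing string further down.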

Comment 2 Keith Seitz 2019-03-04 21:27:01 UTC
Re: older versions of GDB -- Sergio, yes, I tested this on Fedora 29
by copying the appropriate .debug files from my rawhide VM. I'm pretty
confident that this will happen in any version of GDB.

I can't speak for the warnings (issue #1), but for issue #2, there is definitely
something amiss. A (slightly) simplified reproducer for this is:

$ gdb -readnow /usr/lib/debug/usr/lib64/libcmap.so.4.1.0-3.0.1-2.fc30.x86_64.debug
GNU gdb (GDB) Fedora 8.2.50.20190120-15.fc30
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/debug/usr/lib64/libcmap.so.4.1.0-3.0.1-2.fc30.x86_64.debug...

warning: Loadable section ".note.gnu.property" outside of ELF segments
Expanding full symbols from /usr/lib/debug/usr/lib64/libcmap.so.4.1.0-3.0.1-2.fc30.x86_64.debug...
Segmentation fault (core dumped)

As Sergio notes, this occurs because of a NULL pointer dereference in the macro
reading code.

I was curious why this was the case, so I dug a little deeper.

Looking at the debug info for libcmap.so, I see (on rawhide/Fedora 30) (eu-readelf -w)
["truncated" messages are from me]:

DWARF section [29] '.debug_info' at offset 0x16d8:
 [Offset]
 Compilation unit at offset 0:
 Version: 4, Abbreviation section offset: 0, Address size: 8, Offset size: 4
 [     b]  compile_unit         abbrev: 96
           producer             (GNU_strp_alt) "GNU C17 9.0.1 20190129 (Red Hat 9.0.1-0.2) -m64 -mtune=generic -march=x86-64 -g -ggdb3 -O2 -O3 -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection=full -fPIC -fplugin=annobin"
           language             (data1) C99 (12)
           name                 (GNU_strp_alt) "cmap.c"
           comp_dir             (GNU_strp_alt) "/usr/src/debug/corosync-3.0.1-2.fc30.x86_64/lib"
           low_pc               (addr) +0x00000000000023f0 <cmap_inst_free>
           high_pc              (udata) 7825 (+0x0000000000004281 <.annobin_cmap.c_end>)
           stmt_list            (sec_offset) 0
           GNU_macros           (sec_offset) 0
  ** truncated **

Looking at offset 0x0 of .debug_macro:

DWARF section [35] '.debug_macro' at offset 0x12928:

 Offset:             0x0
 Version:            4
 Flag:               0x2
 Offset length:      4
 .debug_line offset: 0x0

 #include offset 0x663
 start_file 0, [1] /usr/src/debug/corosync-3.0.1-2.fc30.x86_64/lib/cmap.c
  start_file 0, [33] /usr/include/stdc-predef.h
   #include offset 0xe95
  end_file
  start_file 35, [34] ../include/corosync/config.h
   #include offset 0xeb1
  end_file
  ** truncated **

So the first thing we have is an import of the section at 0x663. This is:

 Offset:             0x663
 Version:            4
 Flag:               0x0
 Offset length:      4

 #define  (18161), line 0 (sup)
 #define  (18161), line 0 (sup)
 #define  (18161), line 0 (sup)
 #define ^D, line 0 (indirect)
 #define ^E<95>^B<80><D4>, line 0 (indirect)
 #define  (18161), line 0 (sup)
 #define  (18161), line 0 (sup)
 #define  (18161), line 0 (sup)
 #define CS_ERR_EXIST (6375), line 0 (sup)
 #define ^B<E1><CE>, line 0 (indirect)
 #define  (18161), line 0 (sup)
 #define  (18161), line 0 (sup)
 #define  (18161), line 0 (sup)
 ** truncated **

This is surely not correct. [Well, the CS_ERR_EXIST one looks okay!]
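
The "this does not look like a macro" judgement can be roughly automated. The sketch below assumes the eu-readelf line shape shown in the dump above (" #define <text>, line <n> (<form>)") and flags every definition that does not begin with a C identifier; the function name and both regular expressions are illustrative, not part of any existing tool:

```python
import re
from typing import Iterable, List

# Assumed shape of one eu-readelf line: " #define <text>, line <n> (<form>)"
DEFINE_LINE = re.compile(r"\s*#define (.*), line \d+ \(\w+\)\s*$")
# A sane macro definition starts with a C identifier, followed by the end
# of the text, a space, or "(" for function-like macros.
MACRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(?=$|[ (])")


def suspicious_defines(lines: Iterable[str]) -> List[str]:
    """Return the definition text of every #define entry that looks corrupt."""
    bad = []
    for line in lines:
        m = DEFINE_LINE.match(line)
        if m is None:
            continue  # not a #define entry (start_file, #include, headers, ...)
        definition = m.group(1)
        if MACRO_NAME.match(definition) is None:
            bad.append(definition)
    return bad
```

Fed the entries quoted above, this would flag the nameless "(18161)" entries and the control-character names, while leaving CS_ERR_EXIST alone.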

So I cloned the rawhide package and built it. From the resulting
library (corosync-3.0.1/lib/.libs/libcmap.so), I see instead:

 Offset:             0x663
 Version:            4
 Flag:               0x0
 Offset length:      4

 #define __STDC__ 1, line 0 (indirect)
 #define __STDC_VERSION__ 201710L, line 0 (indirect)
 #define __STDC_UTF_16__ 1, line 0 (indirect)
 #define __STDC_UTF_32__ 1, line 0 (indirect)
 #define __STDC_HOSTED__ 1, line 0 (indirect)
 #define __GNUC__ 9, line 0 (indirect)
 #define __GNUC_MINOR__ 0, line 0 (indirect)
 #define __GNUC_PATCHLEVEL__ 1, line 0 (indirect)
 #define __VERSION__ "9.0.1 20190209 (Red Hat 9.0.1-0.4)", line 0 (indirect)
 #define __GNUC_RH_RELEASE__ 0, line 0 (indirect)
 #define __ATOMIC_RELAXED 0, line 0 (indirect)
 #define __ATOMIC_SEQ_CST 5, line 0 (indirect)
 ** truncated **

This looks okay to me. Somewhere between the build and packaging, .debug_macro
is "corrupted." It almost looks fuzzed!

Nonetheless, looking at the NULL dereference, clearly GDB shouldn't do this. However,
even with that fixed, I fear you will likely still have debugging problems.

Comment 3 Sergio Durigan Junior 2019-03-04 21:46:33 UTC
(In reply to Keith Seitz from comment #2)
> So I cloned the rawhide package and built it. From the resulting
> library (corosync-3.0.1/lib/.libs/libcmap.so), I see instead:
> 
>  Offset:             0x663
>  Version:            4
>  Flag:               0x0
>  Offset length:      4
> 
>  #define __STDC__ 1, line 0 (indirect)
>  #define __STDC_VERSION__ 201710L, line 0 (indirect)
>  #define __STDC_UTF_16__ 1, line 0 (indirect)
>  #define __STDC_UTF_32__ 1, line 0 (indirect)
>  #define __STDC_HOSTED__ 1, line 0 (indirect)
>  #define __GNUC__ 9, line 0 (indirect)
>  #define __GNUC_MINOR__ 0, line 0 (indirect)
>  #define __GNUC_PATCHLEVEL__ 1, line 0 (indirect)
>  #define __VERSION__ "9.0.1 20190209 (Red Hat 9.0.1-0.4)", line 0 (indirect)
>  #define __GNUC_RH_RELEASE__ 0, line 0 (indirect)
>  #define __ATOMIC_RELAXED 0, line 0 (indirect)
>  #define __ATOMIC_SEQ_CST 5, line 0 (indirect)
>  ** truncated **
> 
> This looks okay to me. Somewhere between the build and packaging,
> .debug_macro
> is "corrupted." It almost looks fuzzed!

Thanks for the detailed analysis, Keith.

Not sure how you built the package, but I wonder if it's something introduced by the version of GCC we currently have on Rawhide.  It would be good to run a scratch build on Koji and examine the generated debuginfo packages.

Thanks.

Comment 4 Jan Pokorný [poki] 2019-03-06 23:19:15 UTC
By the way, this makes me think it would be desirable to add a check that
"generated debuginfo files can be read back by the current blessed debugger
and possibly some other tools (objdump?)" to the package sanity/gating tests,
so that such breakage is not first discovered at a time when it is
least acceptable (unlike now).
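
A minimal sketch of what such a gating check might look like, assuming gdb is on PATH and that loading the file with "gdb -nx -batch -readnow" is a good enough proxy for "can be read back". The function names, the result labels and the use of the quoted error text as a failure marker are all assumptions of this sketch:

```python
import subprocess
from typing import Callable

# Fragment of the DW_FORM_strp error quoted in the description; treating it
# as a failure marker is an assumption of this sketch.
BAD_DWARF_MARKER = "pointing outside of"


def classify(returncode: int, output: str) -> str:
    """Map a gdb run to "ok", "error" or "crash"."""
    if returncode < 0:
        # subprocess reports death by signal (e.g. SIGSEGV) as a negative code.
        return "crash"
    if returncode != 0 or BAD_DWARF_MARKER in output:
        return "error"
    return "ok"


def check_debug_file(
    path: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Load one .debug file in batch mode with full symbol expansion."""
    cmd = ["gdb", "-nx", "-batch", "-readnow", path]
    proc = run(cmd, capture_output=True, text=True)
    return classify(proc.returncode, proc.stdout + proc.stderr)
```

The run parameter exists only so the check can be exercised without a real gdb; a gating test would call check_debug_file() on every file shipped in the debuginfo subpackages and fail on anything other than "ok".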

Comment 5 Mark Wielaard 2019-03-11 16:03:58 UTC
This is caused by rpm debugedit https://bugzilla.redhat.com/show_bug.cgi?id=1630926
The simple workaround for now is to rebuild any package (corosync in this case) with -g instead of -g3.

Comment 6 Mark Wielaard 2019-03-11 16:08:38 UTC
This is where corosync sets -ggdb3 in configure.ac:

# gdb flags
if test "x${GCC}" = xyes; then
        GDB_FLAGS="-ggdb3"
else
        GDB_FLAGS="-g"
fi

Simply replacing that with "-g" should solve this issue.

Comment 8 Jan Friesse 2019-03-12 07:39:11 UTC
@poki: Thank you for the notice!

@Mark: Yes, -ggdb3 is added to get as much debug information as possible. I don't think "workaround" is really needed if bug 1630926 gets fixed in 8.1.0.

Comment 9 Mark Wielaard 2019-03-14 17:17:18 UTC
(In reply to Jan Friesse from comment #8)
> @Mark: Yes, -ggdb3 is added to get as much debug information as possible. I
> don't think "workaround" is really needed if bug 1630926 gets fixed in 8.1.0.

Yes, the rpm debugedit bug should also be fixed, but whether or not it will be, you will have to rebuild the package anyway.
Better to fix the compile flags to comply with the default build flags that rpmbuild uses. (I don't think -ggdb3 is really recommended).

Comment 10 Jan Pokorný [poki] 2019-03-14 19:08:36 UTC
re [comment 9]:

> (I don't think -ggdb3 is really recommended)

Touching on cause-consequence here is rather subtle thing in this case,
I'd say, since I can imagine using that flag was led with gradually built
dissatisfaction about "nondiscoverability" in the debugger when plain
"-g" was used (based on experience possibly going many years back).
I also seldom deal with situations like that (possibly macro related).
Would be nice to have full discoverability by default, at least for
cases where size of debuginfo files won't be blown out the window,
since it also means the most straightforward support can be provided
when needed (while not resorting to -Og like solutions having other
trade-offs).

I think that if this was a hard requirement, it would also get
enforced by some automated means.

That also implies the list of affected packages won't likely end
with corosync.

(but that is none of my immediate business, /me shuts up)

Comment 11 Mark Wielaard 2019-03-14 19:35:06 UTC
(In reply to Jan Pokorný [poki] from comment #10)
> re [comment 9]:
> 
> > (I don't think -ggdb3 is really recommended)
> 
> Touching on cause-consequence here is rather subtle thing in this case,
> I'd say, since I can imagine using that flag was led with gradually built
> dissatisfaction about "nondiscoverability" in the debugger when plain
> "-g" was used (based on experience possibly going many years back).
> I also seldom deal with situations like that (possibly macro related).
> Would be nice to have full discoverability by default, at least for
> cases where size of debuginfo files won't be blown out the window,
> since it also means the most straightforward support can be provided
> when needed (while not resorting to -Og like solutions having other
> trade-offs).

The point isn't about what a user who builds their own packages might use or why. The point is simply that rpmbuild (or more precisely redhat-rpm-config) has a default set of build flags that includes -g, not -g3 or -ggdb3.

To resolve this bug and get correct debuginfo, the package needs to be rebuilt anyway. So it is good to do that using the default build/debug flags.

There might indeed be other packages that accidentally used -ggdb3. And we should also solve the rpm debugedit bug. But to resolve this issue you don't have to wait for that.
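
To look for other affected packages, one could scan the recorded compiler flags (for example the DW_AT_producer string shown in comment 2) for a debug level that emits macro information. The sketch below is only a heuristic: it recognises -g, -gN, -ggdb and -ggdbN, relies on GCC's documented rule that the last -g option is the effective one, and uses hypothetical function names:

```python
import re
import shlex

# Matches -g, -g0..-g3, -ggdb and -ggdb0..-ggdb3; other -g* options
# (e.g. -grecord-gcc-switches) are deliberately ignored.
DEBUG_FLAG = re.compile(r"-g(?:gdb)?([0-3])?$")


def effective_debug_level(cflags: str) -> int:
    """Return the debug level implied by a flag string (0 if none is given)."""
    level = 0
    for token in shlex.split(cflags):
        m = DEBUG_FLAG.match(token)
        if m:
            # The last -g option wins; a bare -g or -ggdb means level 2.
            level = int(m.group(1)) if m.group(1) else 2
    return level


def emits_macro_info(cflags: str) -> bool:
    """Level 3 is the one that adds macro definitions (.debug_macro)."""
    return effective_debug_level(cflags) == 3
```

With the flags from the corosync producer string ("... -g -ggdb3 -O2 -O3 ...") the trailing -ggdb3 overrides the distribution's -g, which is how the package ended up with a .debug_macro section in the first place.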

Comment 12 Jan Friesse 2019-03-15 07:10:09 UTC
@Mark: Yep, I'm aware of the need to rebuild.

But I'm not too keen to remove -ggdb3. Debug information is not super useful even today because of optimizations, and plain -g would make it even worse.

Of course, if bug 1630926 doesn't get fixed I will use -g, because (as you've pointed out in the other bug) it's better to have some info rather than no info.

Comment 13 Sergio Durigan Junior 2019-05-15 15:22:11 UTC
FWIW, I've pushed a fix for GDB upstream:

commit 7bede82892a06e6c26989803e70f53697392dcf9 (HEAD -> master, origin/master, origin/HEAD)
Author: Sergio Durigan Junior <sergiodj>
Date:   Fri May 10 16:57:26 2019 -0400

    Don't crash if dwarf_decode_macro_bytes's 'body' is NULL

Comment 14 Sergio Durigan Junior 2019-05-15 15:23:16 UTC
It's important to mention that the bug is caused by rpm-build's "debugedit", which corrupts the .debug_macro section.  Bug 1708786 has been opened to track this.

Comment 15 Jan Friesse 2019-05-15 15:59:03 UTC
@Sergio: Thanks for the update. Hopefully we will get a fix for both bug 1708786 and bug 1630926 soon.

Comment 17 Keith Seitz 2019-10-09 18:51:34 UTC
*** Bug 1751253 has been marked as a duplicate of this bug. ***

Comment 18 Keith Seitz 2019-10-17 16:20:22 UTC
I've backported the patch to the 8.2 branch.

Comment 20 Michal Kolar 2020-02-25 12:03:34 UTC
Reproduced against gdb-8.2-6.el8 and verified against gdb-8.2-11.el8.

Comment 22 errata-xmlrpc 2020-04-28 15:42:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1635