Bug 1661199

Summary: gdb crashes in std::vector<context_stack, std::allocator<context_stack> >::empty() when reading core/debug symbols
Product: [Fedora] Fedora Reporter: Jan Pokorný [poki] <jpokorny>
Component: gdb Assignee: Sergio Durigan Junior <sergiodj>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: jan.kratochvil, keiths, kevinb, pmuldoon, sergiodj
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2018-12-20 16:49:06 UTC Type: Bug

Description Jan Pokorný [poki] 2018-12-20 11:48:45 UTC
With gdb-8.2.50.20181130-11.fc30.x86_64, I cannot load the core file
generated for python3.7 (python3-3.7.1-4.fc30.x86_64), which was running
dnf when it crashed for me (cf. [bug 1610456]), because gdb itself crashes.

Will attach this core as a private attachment, since there's quite
a lot of data in there that I cannot reasonably check.

Anyway, the crashing thread for _gdb_ looks like this:

#0  0x000055ae7b41185c std::vector<context_stack, std::allocator<context_stack> >::empty() const (gdb)
#1  0x000055ae7b41a517 process_die (gdb)
#2  0x000055ae7b41905b inherit_abstract_dies (gdb)
#3  0x000055ae7b41944b read_func_scope (gdb)
#4  0x000055ae7b41a53d process_die (gdb)
#5  0x000055ae7b41f0cb read_file_scope (gdb)
#6  0x000055ae7b419bd5 process_die (gdb)
#7  0x000055ae7b41fa06 process_full_comp_unit (gdb)
#8  0x000055ae7b41ff07 dw2_instantiate_symtab (gdb)
#9  0x000055ae7b420018 dw2_find_pc_sect_compunit_symtab (gdb)
#10 0x000055ae7b62fa18 find_pc_sect_compunit_symtab(unsigned long, obj_section*) (gdb)
#11 0x000055ae7b440d59 select_frame(frame_info*) (gdb)
#12 0x000055ae7b442223 select_frame(frame_info*) (gdb)
#13 0x000055ae7b3b3372 core_target_open(char const*, int) (gdb)
#14 0x000055ae7b4ed3d8 catch_command_errors (gdb)
#15 0x000055ae7b4eea95 captured_main_1 (gdb)
#16 0x000055ae7b2c837f main (gdb)
#17 0x00007eff4cf00ee3 __libc_start_main (libc.so.6)
#18 0x000055ae7b2ccfbe _start (gdb)

(all other threads in pthread_cond_wait).

When I press Ctrl-C while the debug symbols are initially being
resolved, I also see that I am missing a couple of other debuginfo
packages:

> Missing separate debuginfos, use: dnf debuginfo-install
> brotli-1.0.7-1.fc30.x86_64 bzip2-libs-1.0.6-28.fc29.x86_64
> cyrus-sasl-lib-2.1.27-0.4rc7.fc30.x86_64 elfutils-libelf-0.175-2.fc30.x86_64
> elfutils-libs-0.175-2.fc30.x86_64 expat-2.2.6-1.fc30.x86_64
> file-libs-5.35-2.fc30.x86_64 glib2-2.58.1-2.fc30.x86_64
> glibc-2.28.9000-27.fc30.x86_64 gpgme-1.11.1-3.fc29.x86_64
> ima-evm-utils-1.1-4.fc29.x86_64 json-c-0.13.1-3.fc29.x86_64
> keyutils-libs-1.6-1.fc30.x86_64 krb5-libs-1.17-1.beta2.1.fc30.x86_64
> libacl-2.2.53-2.fc29.x86_64 libassuan-2.5.1-4.fc29.x86_64
> libattr-2.4.48-4.fc30.x86_64 libblkid-2.33-0.1.fc30.x86_64
> libcap-2.25-12.fc29.x86_64 libcom_err-1.44.4-1.fc30.x86_64
> libcomps-0.1.9-2gd29de45.fc30.x86_64 libcurl-7.63.0-2.fc30.x86_64
> libdb-5.3.28-34.fc30.x86_64 libffi-3.1-18.fc29.x86_64
> libgcc-8.2.1-6.fc30.x86_64 libgpg-error-1.31-2.fc29.x86_64
> libidn2-2.0.5-2.fc29.x86_64 libmodulemd1-1.8.0-1.fc30.x86_64
> libmount-2.33-0.1.fc30.x86_64 libpsl-0.20.2-5.fc29.x86_64
> librepo-1.9.2-1.fc30.x86_64 libselinux-2.8-5.fc30.x86_64
> libsmartcols-2.33-0.1.fc30.x86_64 libsolv-0.7.2-11g95dcddc7.fc30.x86_64
> libssh-0.8.5-1.fc30.x86_64 libstdc++-8.2.1-6.fc30.x86_64
> libunistring-0.9.10-4.fc29.x86_64 libuuid-2.33-0.1.fc30.x86_64
> libxcrypt-4.4.1-1.fc30.x86_64 libyaml-0.2.1-2.fc29.x86_64
> libzstd-1.3.6-1.fc30.x86_64 lua-libs-5.3.5-2.fc29.x86_64
> openldap-2.4.46-10.fc30.x86_64 openssl-libs-1.1.1-7.fc30.x86_64
> pcre-8.42-5.fc30.x86_64 pcre2-10.32-4.fc30.x86_64 popt-1.16-16.fc30.x86_64
> python3-gpg-1.11.1-3.fc29.x86_64 rpm-libs-4.14.2.1-3.fc30.x86_64
> sqlite-libs-3.26.0-1.fc30.x86_64 xz-libs-5.2.4-4.fc30.x86_64
> zlib-1.2.11-14.fc30.x86_64

but these are not relevant to me at the moment,
so I am skipping them; the only debuginfo package I installed on purpose
is the one for said python3 version.  Btw., without that debuginfo package,
gdb appears to work, but only until I run "backtrace".

Comment 2 Sergio Durigan Junior 2018-12-20 16:49:06 UTC
Thanks for the report, Jan.  Further debugging shows that the problem happens because:

(top-gdb) up
#3  0x00000000005d980b in new_symbol (die=0x74c61f0, type=0x0, cu=0x716f1e0, space=0x0) at ../../binutils-gdb/gdb/dwarf2read.c:21607
21607                 = cu->builder->get_current_context_stack ();
(top-gdb) p cu.builder
$2 = std::unique_ptr<buildsym_compunit> = {get() = 0x0}

This is a known issue that we're tracking in Bug 1638798, so I'm closing this bug as a duplicate.  The current status as I write this message is that we're waiting for upstream to review and approve Keith's patch to fix the issue.  Once that happens, I will release a Rawhide GDB containing the patch.
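
For readers unfamiliar with the failure mode: the session above shows cu->builder holding a null pointer at the moment it is dereferenced. The sketch below only illustrates that pattern in isolation. The types are hypothetical stand-ins that borrow names from the session, current_context_or_null and start_builder are invented for the example, and this is not Keith's patch or the actual GDB code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; only the names echo the gdb session above.
struct context_stack {
    int depth;
};

struct buildsym_compunit {
    std::vector<context_stack> stack;

    // Returns the innermost context, or nullptr when the stack is empty.
    const context_stack *get_current_context_stack() const {
        return stack.empty() ? nullptr : &stack.back();
    }
};

struct dwarf2_cu {
    // Null until a builder has been attached to this compilation unit.
    std::unique_ptr<buildsym_compunit> builder;
};

// Attaches a fresh builder; it is a programming error to call this twice.
void start_builder(dwarf2_cu &cu) {
    assert(!cu.builder);
    cu.builder.reset(new buildsym_compunit());
}

// Guarded lookup: calling cu.builder->... while builder is null is undefined
// behaviour (typically a segfault inside the callee), so check first.
const context_stack *current_context_or_null(const dwarf2_cu &cu) {
    if (!cu.builder)
        return nullptr;
    return cu.builder->get_current_context_stack();
}
```

With a default-constructed dwarf2_cu, current_context_or_null returns nullptr instead of crashing; after start_builder and a push onto the stack, it returns the innermost entry. Whether a guard like this or ensuring the builder exists earlier is the right fix depends on the upstream patch, which is not reproduced here.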

Thanks.

*** This bug has been marked as a duplicate of bug 1638798 ***

Comment 3 Jan Pokorný [poki] 2018-12-20 21:59:03 UTC
Thanks, interestingly, it's not that frequent to hit this problem
and now I see the original bug also deals with Python 3 interpreter :-)
(I looked for dupes first, I swear, must have missed that)

Comment 4 Sergio Durigan Junior 2018-12-21 02:51:57 UTC
(In reply to Jan Pokorný [poki] from comment #3)
> Thanks, interestingly, it's not that frequent to hit this problem
> and now I see the original bug also deals with Python 3 interpreter :-)

Yep :-).  There's a very specific DWARF layout that triggers this problem on Python 3's debuginfo.

> (I looked for dupes first, I swear, must have missed that)

No problem at all, it's hard to spot exactly where the problem happens sometimes.

Thanks.