Created attachment 1855514 [details]
C++ source to reproduce crash

Description of problem:

GDB crashes while stepping through a valid program.

Version-Release number of selected component (if applicable):

gdb-11.1-5.fc34.x86_64
gdb-11.1-5.fc35.x86_64

How reproducible:

Always.

Steps to Reproduce:

Using the attached fs_ops.cc file on F35:

$ rpm -q gcc-c++ gdb
gcc-c++-11.2.1-7.fc35.ppc64le
gdb-11.1-5.fc35.ppc64le
$ g++ -g fs_ops.cc
$ gdb -q -ex start -ex n -ex n -ex step -ex n -ex step -ex finish -ex cont -ex 'print \"finished\"' a.out
Reading symbols from a.out...
Temporary breakpoint 1 at 0x1000f050: file fs_ops.cc, line 2350.
Starting program: /tmp/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, main () at fs_ops.cc:2350
2350      std::error_code ec = std::make_error_code(std::errc::invalid_argument);
2351      std::filesystem::path p = "foo/a";
2352      std::filesystem::remove_all(p, ec);
std::filesystem::remove_all (p=filesystem::path "foo/a" = {...}, ec=std::error_code = {"generic": EINVAL}) at fs_ops.cc:2032
2032      ec.clear();
2033      return fs::do_remove_all(p, ErrorReporter{ec});
std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffe880, ec=std::error_code = { }) at fs_ops.cc:1942
1942        ErrorReporter(error_code& ec) : code(&ec)
Run till exit from #0  std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffe880, ec=std::error_code = { }) at fs_ops.cc:1942
Aborted (core dumped)

Or use the attached a-fs_ops.ii file (for x86_64) and run these commands:

mock -q -r fedora-35-x86_64 --install gdb gcc-c++
mock -q -r fedora-35-x86_64 --dnf-cmd debuginfo-install glibc-2.34-11.fc35.x86_64 libgcc-11.2.1-7.fc35.x86_64 libstdc++-11.2.1-7.fc35.x86_64
# Copy the attached file into the mock root:
cp a-fs_ops.ii /var/lib/mock/fedora-35-x86_64/root/tmp/
mock -q -r fedora-35-x86_64 --chroot "cd /tmp && g++ -g a-fs_ops.ii"
# Test non-interactively:
mock -q -r fedora-35-x86_64 --chroot "cd /tmp && gdb -q -ex start -ex n -ex n -ex step -ex n -ex step -ex finish -ex cont -ex 'print \"finished\"' a.out ; echo $?"
mock -q -r fedora-35-x86_64 --shell
# These commands should be run in the mock shell:
cd /tmp
gdb -q -ex start -ex n -ex n -ex step -ex n -ex step -ex finish a.out

Actual results:

Aborted (core dumped)

Expected results:

GDB 'finish' runs the function to completion

Additional info:

When trying to run the mock commands above entirely non-interactively it *sometimes* prints '0' after the GDB command, but this seems to be a mock bug. The GDB process aborts before running the 'cont' and 'print "finished"' commands:

$ mock -q -r fedora-35-x86_64 --chroot "cd /tmp && gdb -q -ex start -ex n -ex n -ex step -ex n -ex step -ex finish -ex cont -ex 'print \"finished\"' a.out ; echo $?"
Reading symbols from a.out...
Temporary breakpoint 1 at 0x40cdae: file fs_ops.cc, line 2350.
Starting program: /tmp/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, main () at fs_ops.cc:2350
2350    fs_ops.cc: No such file or directory.
2351    in fs_ops.cc
2352    in fs_ops.cc
std::filesystem::remove_all (p=filesystem::path "foo/a" = {...}, ec=std::error_code = {"generic": EINVAL}) at fs_ops.cc:2032
2032    in fs_ops.cc
2033    in fs_ops.cc
std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffea70, ec=std::error_code = { }) at fs_ops.cc:1942
1942    in fs_ops.cc
Run till exit from #0  std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffea70, ec=std::error_code = { }) at fs_ops.cc:1942
0

This *should* be printing 134, the shell exit code for SIGABRT.

When run interactively with mock --shell the abort is always seen:

$ mock -q -r fedora-35-x86_64 --shell
<mock-chroot> sh-5.1# cd /tmp
<mock-chroot> sh-5.1# gdb -q -ex start -ex n -ex n -ex step -ex n -ex step -ex finish a.out
Reading symbols from a.out...
Download failed: No route to host.  Continuing without source file /tmp/fs_ops.cc.
Temporary breakpoint 1 at 0x40cdae: file fs_ops.cc, line 2350.
Starting program: /tmp/a.out
Download failed: No route to host.  Continuing without debug info for /tmp/system-supplied DSO at 0x7ffff7fc8000.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, main () at fs_ops.cc:2350
Download failed: No route to host.  Continuing without source file /tmp/fs_ops.cc.
2350    fs_ops.cc: No such file or directory.
2351    in fs_ops.cc
2352    in fs_ops.cc
std::filesystem::remove_all (p=filesystem::path "foo/a" = {...}, ec=std::error_code = {"generic": EINVAL}) at fs_ops.cc:2032
2032    in fs_ops.cc
2033    in fs_ops.cc
std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffe2f0, ec=std::error_code = { }) at fs_ops.cc:1942
1942    in fs_ops.cc
Run till exit from #0  std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffe2f0, ec=std::error_code = { }) at fs_ops.cc:1942
Aborted (core dumped)
<mock-chroot> sh-5.1# echo $?
134

It's 100% reproducible on a real F35 system without mock anyway.

The same crash happens on F34, but not with the system g++, only when using a self-built gcc 11.2.1 or 12.0.1 to compile the code. This suggests there may have been some change in GCC's debuginfo between F34's:

gcc version 11.2.1 20210728 (Red Hat 11.2.1-1) (GCC)

and F35's:

gcc version 11.2.1 20211203 (Red Hat 11.2.1-7) (GCC)

Either way, GDB should not abort.
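For readers unfamiliar with the 134 value: a POSIX shell reports 128 plus the signal number when a command is killed by a signal, and SIGABRT is 6 on Linux. The following is a minimal Python sketch of that arithmetic; the helper name `shell_exit_status` is hypothetical and not part of any tool mentioned in this report. Separately, note that in the non-interactive command the `$?` sits inside a double-quoted string, so the invoking shell expands it before mock runs. That may be why '0' is sometimes printed, rather than a mock bug.

```python
import signal
import subprocess
import sys


def shell_exit_status(returncode):
    """Map a subprocess returncode to the value a POSIX shell shows in $?.

    subprocess reports death by signal N as -N; shells report 128 + N.
    (shell_exit_status is a hypothetical helper for illustration only.)
    """
    if returncode < 0:
        return 128 + (-returncode)
    return returncode


if __name__ == "__main__":
    # Run a child process that aborts itself, as GDB does in this report.
    child = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stderr=subprocess.DEVNULL,
    )
    print(shell_exit_status(child.returncode))
```

Using single quotes around the whole --chroot command, or escaping the dollar sign, would let the shell inside the chroot expand `$?` instead.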
Created attachment 1855515 [details] Preprocessed C++ source to reproduce crash (a-fs_ops.ii)
I can't reproduce it on rawhide because of:

../../gdb/objfiles.h:510: internal-error: sect_index_data not initialized
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

But I think a fix for that is on the way.
Running "up" also crashes it. I think it's something to do with printing the context of the previous stack frame.
All I get when running gdb under gdb is:

"gdb" received signal SIGSEGV, Segmentation fault.
0x00005555557a98d2 in scoped_debug_start_end::scoped_debug_start_end (this=0x7fffff7ff0c0, debug_enabled=@0x55555626ffc0: false, module=0x555555e59f87 "frame", func=0x555555e0510b "frame_unwind_find_by_frame", start_prefix=0x555555ec09c9 "enter", end_prefix=0x555555e4279c "exit", fmt=0x0) at ../../gdb/../gdbsupport/common-debug.h:108
108       scoped_debug_start_end (bool &debug_enabled, const char *module,
Looks like a stack overflow. I get tens of thousands of frames like this:

#0  0x00005555557a98d2 in scoped_debug_start_end::scoped_debug_start_end (this=0x7fffff7ff0d0, debug_enabled=@0x55555626ffc0: false, module=0x555555e59f87 "frame", func=0x555555e0510b "frame_unwind_find_by_frame", start_prefix=0x555555ec09c9 "enter", end_prefix=0x555555e4279c "exit", fmt=0x0) at ../../gdb/../gdbsupport/common-debug.h:108
#1  0x00005555558d9012 in frame_unwind_find_by_frame (this_frame=this_frame@entry=0x555557b78010, this_cache=this_cache@entry=0x555557b78028) at ../../gdb/frame-unwind.c:181
#2  0x00005555558da0a9 in frame_unwind_arch (next_frame=0x555557b78010) at ../../gdb/frame.c:2863
#3  0x00005555558da6b0 in get_frame_arch (this_frame=this_frame@entry=0x555557b78010) at ../../gdb/frame.c:2852
#4  0x00005555558d9044 in frame_unwind_find_by_frame (this_frame=this_frame@entry=0x555557b78010, this_cache=this_cache@entry=0x555557b78028) at ../../gdb/frame-unwind.c:184
#5  0x00005555558da0a9 in frame_unwind_arch (next_frame=0x555557b78010) at ../../gdb/frame.c:2863
#6  0x00005555558da6b0 in get_frame_arch (this_frame=this_frame@entry=0x555557b78010) at ../../gdb/frame.c:2852
#7  0x00005555558d9044 in frame_unwind_find_by_frame (this_frame=this_frame@entry=0x555557b78010, this_cache=this_cache@entry=0x555557b78028) at ../../gdb/frame-unwind.c:184
#8  0x00005555558da0a9 in frame_unwind_arch (next_frame=0x555557b78010) at ../../gdb/frame.c:2863
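The repeating pattern in the frames above is what points to unbounded recursion. As a rough illustration only, the Python sketch below finds the shortest repeating run of function names in a backtrace; `find_cycle` is a hypothetical helper, and the frame names are copied from the backtrace above.

```python
def find_cycle(frames, skip=1):
    """Return the shortest repeating run of frame names, or None.

    `skip` drops the innermost frames that are not part of the loop
    (here, frame #0, which is just where the stack ran out).
    """
    tail = frames[skip:]
    for period in range(1, len(tail) // 2 + 1):
        # A period fits if every frame equals the one `period` frames later.
        if all(tail[i] == tail[i + period] for i in range(len(tail) - period)):
            return tail[:period]
    return None


# Function names of frames #0 to #8 from the backtrace above.
backtrace = [
    "scoped_debug_start_end::scoped_debug_start_end",
    "frame_unwind_find_by_frame",
    "frame_unwind_arch",
    "get_frame_arch",
    "frame_unwind_find_by_frame",
    "frame_unwind_arch",
    "get_frame_arch",
    "frame_unwind_find_by_frame",
    "frame_unwind_arch",
]

print(find_cycle(backtrace))
```

For these frames the cycle is frame_unwind_find_by_frame, frame_unwind_arch, get_frame_arch, all operating on the same frame pointer 0x555557b78010.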
If I compile my code with -fno-omit-frame-pointer gdb crashes sooner and differently (which is probably a separate bug):

Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
0x00005555558eab1a in gdbarch_num_regs (gdbarch=0x7000600070007) at ../../gdb/gdbarch.c:2120
2120      gdb_assert (gdbarch->num_regs != -1);
(gdb) bt
#0  0x00005555558eab1a in gdbarch_num_regs (gdbarch=0x7000600070007) at ../../gdb/gdbarch.c:2120
#1  0x0000555555a75938 in reg_buffer::num_raw_registers (this=0x5555572153f0) at ../../gdb/regcache.c:225
#2  readable_regcache::cooked_read (this=0x5555572153f0, regnum=16, buf=0x5555571379c0 "") at ../../gdb/regcache.c:692
#3  0x000055555584fd5d in dummy_frame_prev_register (this_frame=<optimized out>, this_prologue_cache=<optimized out>, regnum=16) at ../../gdb/dummy-frame.c:356
#4  0x00005555558da8a1 in frame_unwind_register_value (next_frame=0x5555564364c0, regnum=16) at ../../gdb/frame.c:1233
#5  0x00005555558dacd3 in frame_register_unwind (next_frame=next_frame@entry=0x5555564364c0, regnum=regnum@entry=16, optimizedp=optimizedp@entry=0x7fffffffcd20, unavailablep=unavailablep@entry=0x7fffffffcd24, lvalp=lvalp@entry=0x7fffffffcd2c, addrp=addrp@entry=0x7fffffffcd30, realnump=0x7fffffffcd28, bufferp=0x7fffffffcd50 "\340AcVUU") at ../../gdb/frame.c:1143
#6  0x00005555558db11f in frame_unwind_register (next_frame=next_frame@entry=0x5555564364c0, regnum=16, buf=buf@entry=0x7fffffffcd50 "\340AcVUU") at ../../gdb/frame.c:1199
#7  0x0000555555927a48 in i386_unwind_pc (gdbarch=0x5555566341e0, next_frame=0x5555564364c0) at ../../gdb/i386-tdep.c:1970
#8  0x00005555558da108 in frame_unwind_pc (this_frame=0x5555564364c0) at ../../gdb/frame.c:948
#9  0x00005555558da1ee in get_frame_pc_if_available (frame=frame@entry=0x55555782ac90, pc=pc@entry=0x7fffffffcee0) at ../../gdb/frame.c:2549
#10 0x0000555555b16981 in print_frame_info (fp_opts=..., frame=0x55555782ac90, print_level=<optimized out>, print_what=SRC_AND_LOC, print_args=<optimized out>, set_current_sal=1) at ../../gdb/stack.c:1188
#11 0x0000555555b17355 in print_stack_frame (frame=0x55555782ac90, print_level=1, print_what=SRC_AND_LOC, set_current_sal=1) at ../../gdb/stack.c:366
#12 0x0000555555b17407 in print_stack_frame_to_uiout (uiout=0x555556537650, frame=0x55555782ac90, print_level=1, print_what=SRC_AND_LOC, set_current_sal=1) at ../../gdb/stack.c:345
#13 0x0000555555b97dff in tui_on_user_selected_context_changed (selection=...) at ../../gdb/tui/tui-interp.c:231
#14 0x0000555555b1425e in std::function<void (enum_flags<user_selected_what_flag>)>::operator()(enum_flags<user_selected_what_flag>) const (__args#0=..., this=0x555556499c60) at /usr/include/c++/11/bits/std_function.h:560
#15 gdb::observers::observable<enum_flags<user_selected_what_flag> >::notify (args#0=..., this=<optimized out>) at ../../gdb/../gdbsupport/observable.h:150
#16 up_command (count_exp=<optimized out>, from_tty=<optimized out>) at ../../gdb/stack.c:2693
#17 0x00005555557e6f2a in cmd_func (cmd=<optimized out>, args=<optimized out>, from_tty=<optimized out>) at ../../gdb/cli/cli-decode.c:2160
#18 0x0000555555b7c747 in execute_command (p=<optimized out>, p@entry=<error reading variable: value has been optimized out>, from_tty=1, from_tty@entry=<error reading variable: value has been optimized out>) at ../../gdb/top.c:674
#19 0x000055555599fff3 in catch_command_errors (command=<optimized out>, arg=<optimized out>, from_tty=<optimized out>, do_bp_actions=<optimized out>) at ../../gdb/main.c:523
#20 0x00005555559a00c2 in execute_cmdargs (cmdarg_vec=cmdarg_vec@entry=0x7fffffffd4a0, file_type=file_type@entry=CMDARG_FILE, cmd_type=cmd_type@entry=CMDARG_COMMAND, ret=ret@entry=0x7fffffffd49c) at ../../gdb/main.c:618
#21 0x00005555559a1a36 in captured_main_1 (context=<optimized out>) at ../../gdb/main.c:1322
#22 0x00005555559a25cf in captured_main (data=<optimized out>) at ../../gdb/main.c:1343
#23 gdb_main (args=<optimized out>) at ../../gdb/main.c:1368
#24 0x00005555556ec9b0 in main (argc=<optimized out>, argv=<optimized out>) at ../../gdb/gdb.c:40
The crash is caused by a libstdc++ pretty printer. Probably the one for std::error_code. I don't know how that causes GDB to go into a loop though.
It's this part of the std::error_code printer:

    @staticmethod
    def _category_name(cat):
        "Call the virtual function that overrides std::error_category::name()"
        gdb.set_convenience_variable('__cat', cat)
        return gdb.parse_and_eval('$__cat->name()').string()

If I replace that with just return "" then GDB shows:

Run till exit from #0  std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffd720, ec=std::error_code = {"": 0}) at fs_ops.cc:1942
std::filesystem::remove_all (p=filesystem::path "foo/a"<error reading variable: Cannot access memory at address 0x430eb000>, ec=std::error_code = {"": 0}) at fs_ops.cc:2033
2033      return fs::do_remove_all(p, ErrorReporter{ec});

With the printer it shows:

Run till exit from #0  std::filesystem::(anonymous namespace)::ErrorReporter::ErrorReporter (this=0x7fffffffd720, ec=std::error_code = { }) at fs_ops.cc:1942
Aborted (core dumped)
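A printer that makes an inferior call can at least guard against the cases where GDB reports an error instead of crashing. The sketch below is an assumption-laden illustration, not the libstdc++ printer: `guarded_category_name` and the two fake callables are hypothetical, and the real call is injected as `call_name` so the snippet runs outside GDB. It relies on the documented fact that `gdb.error` derives from Python's `RuntimeError`. It would not help with the abort in this report, where GDB itself overflows its stack before any Python exception is raised.

```python
def guarded_category_name(cat, call_name, fallback="<error reading category>"):
    """Return call_name(cat), or `fallback` if the call raises RuntimeError.

    Inside GDB, call_name would be something like the _category_name
    method quoted above; gdb.error is a subclass of RuntimeError.
    """
    try:
        return call_name(cat)
    except RuntimeError:
        return fallback


# Stand-ins for the inferior call, so this runs without the gdb module.
def fake_call_ok(cat):
    return "generic"


def fake_call_bad_memory(cat):
    raise RuntimeError("Cannot access memory at address 0x430eb000")


print(guarded_category_name(object(), fake_call_ok))
print(guarded_category_name(object(), fake_call_bad_memory))
```

Injecting the call keeps the guard testable without a live debugging session; in a real printer the fallback string would be whatever placeholder the printer already uses for unreadable values.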
The crash seems to happen when trying to print the details of a stack frame, which happens when running 'up' or 'finish' or 'bt'.

The fact that I only saw it with inline functions was a red herring. What matters is that the function has a std::error_code& parameter. I think the value of the std::error_code& parameter is garbage while entering or leaving the function, so when the printer tries to call a virtual function it goes off into the weeds.

This reproduces it:

#include <system_error>

int f(std::error_code& ec)
{
  ec.assign(1, std::system_category());
  return ec.value();
}

int g(std::error_code& ec)
{
  return f(ec);
}

int main()
{
  std::error_code ec;
  return g(ec);
}

g++ ec.C -g
gdb -q -ex start -ex n -ex step -ex step -ex n -ex up -ex up a.out
Reading symbols from a.out...
Temporary breakpoint 1 at 0x40117b: file ec.C, line 16.
Starting program: /tmp/a.out

Temporary breakpoint 1, main () at ec.C:16
16        std::error_code ec;
17        return g(ec);
g (ec=std::error_code = { }) at ec.C:11
11        return f(ec);
f (ec=std::error_code = { }) at ec.C:5
5         ec.assign(1, std::system_category());
6         return ec.value();
#1  0x0000000000401171 in g (ec=std::error_code = {"system": EPERM}) at ec.C:11
11        return f(ec);
Aborted (core dumped)
Reported upstream with a minimal reproducer that doesn't depend on libstdc++ printers. https://sourceware.org/bugzilla/show_bug.cgi?id=28856
This message is a reminder that Fedora Linux 35 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '35'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 35 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Works in F36