Bug 1656165

Summary: readline library's tab completion facility may trigger SIGSEGV
Product: Red Hat Enterprise Linux 7 Reporter: Sterling Alexander <stalexan>
Component: crash Assignee: Dave Anderson <anderson>
Status: CLOSED ERRATA QA Contact: Emma Wu <xiawu>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.7 CC: bhu, ruyang, sbarcomb, xiawu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: crash-7.2.3-9.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-06 12:41:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1647768    
Attachments:
Description                   Flags
Crash session that crashed    none

Description Sterling Alexander 2018-12-04 21:09:47 UTC
Created attachment 1511468
Crash session that crashed

Description of problem:  Crash crashes when examining the following retrace task on optimus:

crash> sys
      KERNEL: /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/2.6.32-696.13.2.el6.x86_64/vmlinux
    DUMPFILE: /cores/retrace/tasks/110186978/crash/vmcore  [PARTIAL DUMP]
        CPUS: 8 [OFFLINE: 7]
        DATE: Mon Feb  5 04:13:11 2018
      UPTIME: 80 days, 07:36:30
LOAD AVERAGE: 145.48, 121.62, 81.48
       TASKS: 1093
    NODENAME: XXXXXXXXXX
     RELEASE: 2.6.32-696.13.2.el6.x86_64
     VERSION: #1 SMP Fri Sep 22 12:32:14 EDT 2017
     MACHINE: x86_64  (2396 Mhz)
      MEMORY: 32 GB
       PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"



Version-Release number of selected component (if applicable):

$ crash --version

crash 7.2.4
Copyright (C) 2002-2017  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".


How reproducible:  Happened several times, full session log attached to the BZ


Steps to Reproduce:
1.  Use crash to analyse the core

Actual results:  Crash crashes


Expected results:  Crash doesn't crash


Additional info:

Comment 2 Dave Anderson 2018-12-05 15:18:18 UTC
I've never used the readline library's tab-completion feature in the crash
utility (I didn't even consider it being enabled).  I'm certainly not
familiar with the library's internals, so don't hold your breath awaiting
a fix.

Comment 4 Dave Anderson 2018-12-05 16:35:07 UTC
The failure can occur in several different paths, because the damage has
already been done by the time the corruption is detected.  Here are a couple
of backtraces that are more relevant than the one in the attached file; in
both, the failure occurs while executing the readline() call:

crash> whatis mu*** Error in `./crash': free(): invalid pointer: 0x00007f095e62a000 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81489)[0x7f095d559489]
/lib64/libc.so.6(_IO_free_backup_area+0x1a)[0x7f095d55473a]
/lib64/libc.so.6(_IO_file_overflow+0x1d5)[0x7f095d553ea5]
/lib64/libc.so.6(_IO_file_xsputn+0xb0)[0x7f095d552810]
/lib64/libc.so.6(fputs+0xbb)[0x7f095d546e4b]
./crash[0x761e24]
./crash[0x762180]
./crash(fprintf_filtered+0x8c)[0x76320c]
./crash(throw_exception+0x63)[0x6a5cb3]
./crash[0x6a5f49]
./crash[0x6a6166]
./crash[0x760bc4]
./crash(c_parse_internal+0x32d6)[0x618a56]
./crash(c_parse+0x159)[0x618db9]
./crash[0x6d183a]
./crash(parse_expression_for_completion+0x71)[0x6d1b31]
./crash(expression_completer+0x76)[0x6b0726]
./crash[0x6afbf9]
./crash(readline_line_completion_function+0x59)[0x6b0669]
./crash(rl_completion_matches+0x61)[0x793331]
./crash(rl_complete_internal+0xf8)[0x793528]
./crash(_rl_dispatch_subseq+0x173)[0x78be63]
./crash(readline_internal_char+0x9f)[0x78c16f]
./crash(readline+0x45)[0x78c775]
./crash(process_command_line+0x1c3)[0x54f4a3]
./crash(main_loop+0x1e5)[0x467ed5]
./crash[0x6a7733]
./crash(catch_errors+0x7a)[0x6a645a]
./crash[0x6a86c6]
./crash(catch_errors+0x7a)[0x6a645a]
./crash(gdb_main_entry+0x47)[0x6a8a27]
./crash(main+0x775)[0x466265]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f095d4fa3d5]
./crash[0x46750e]

crash> whatis mu*** Error in `./crash': malloc(): memory corruption: 0x00007feeec3f7010 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x82c96)[0x7feeeb325c96]
/lib64/libc.so.6(+0x8382c)[0x7feeeb32682c]
/lib64/libc.so.6(realloc+0x1d2)[0x7feeeb328832]
./crash(xrealloc+0x1d)[0x7873ad]
./crash(vec_o_reserve+0x5f)[0x73bdff]
./crash[0x673fa6]
./crash(default_make_symbol_completion_list_break_on+0x3ac)[0x678a9c]
./crash(location_completer+0x322)[0x6b0222]
./crash(expression_completer+0x11e)[0x6b07ce]
./crash[0x6afbf9]
./crash(readline_line_completion_function+0x59)[0x6b0669]
./crash(rl_completion_matches+0x61)[0x793331]
./crash(rl_complete_internal+0xf8)[0x793528]
./crash(_rl_dispatch_subseq+0x173)[0x78be63]
./crash(readline_internal_char+0x9f)[0x78c16f]
./crash(readline+0x45)[0x78c775]
./crash(process_command_line+0x1c3)[0x54f4a3]
./crash(main_loop+0x1e5)[0x467ed5]
./crash[0x6a7733]
./crash(catch_errors+0x7a)[0x6a645a]
./crash[0x6a86c6]
./crash(catch_errors+0x7a)[0x6a645a]
./crash(gdb_main_entry+0x47)[0x6a8a27]
./crash(main+0x775)[0x466265]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7feeeb2c53d5]
./crash[0x46750e]

The rl_completion_matches() function is where the transition is made
from the readline library code to base gdb code.

Comment 5 Dave Anderson 2018-12-06 18:25:32 UTC
I can't really figure out how to effectively debug this, given that the damage
has been done by the time the malloc/free/corruption is detected.  Staring
at the code doesn't show anything obvious.  My best guess is that it has
more to do with the embedded gdb completion code than the readline library
itself.  Or perhaps it's an issue related to the crash/gdb marriage: this is
the only place where gdb code is invoked directly, rather than by the top-level
crash utility going through its well-defined interface.  That alone is
a little bit disconcerting.

Anyway, I think I'll take a look at writing a readline completer plugin,
which would take gdb totally out of the picture.  It should be faster than
using the gdb completer, and would also remove the useless clutter of 
showing filenames as a completion option, which makes no sense.

Comment 6 Dave Anderson 2018-12-07 20:36:24 UTC
> ...
> Anyway, I think I'll take a look at writing a readline completer plugin,
> which would take gdb totally out of the picture.  It should be faster than
> using the gdb completer, and would also remove the useless clutter of 
> showing filenames as a completion option, which makes no sense.

A patch has been applied upstream:

https://github.com/crash-utility/crash/commit/0f65ae0c36bf04e22219f28c32c3ae0cdee5acfe

  Implemented a new plugin function for the readline library's tab
  completion feature.  Without the patch, the use of the default plugin
  from the embedded gdb module has been seen to cause segmentation
  violations or other fatal malloc/free/corruption assertions.  The new
  plugin takes gdb out of the picture entirely, and also restricts the
  matching options to just symbol names, so as not to clutter the
  results with irrelevant filenames.
  (anderson)


Also, because the top-level crash code already has a symbol list, the new
plugin avoids having to do the malloc/realloc/frees that the gdb code
does in generating the list of matching options -- which is where I 
*believe* the reported problem lies.
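
For readers unfamiliar with readline completers, here is a minimal sketch of
the general idea.  It is not the code from the upstream commit.  The names
demo_symbols and demo_symbol_generator are hypothetical, and the small static
table only stands in for a symbol list that the top-level program already
holds.  The generator follows the (text, state) calling convention that
readline's rl_completion_matches() expects: state is 0 on the first call for
a completion attempt, each match is returned as a heap-allocated copy, and
NULL ends the list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a symbol list the program already holds. */
static const char *const demo_symbols[] = {
    "mutex_lock", "mutex_unlock", "mutex_trylock",
    "schedule", "sys_read", NULL
};

/*
 * Generator in the (text, state) shape used by readline completers.
 * state == 0 marks the first call for a new completion attempt, so the
 * cursor is reset; later calls resume where the previous one stopped.
 * Each match is a fresh heap copy that the caller owns and must free.
 * NULL signals that there are no more matches.
 */
char *demo_symbol_generator(const char *text, int state)
{
    static size_t next;      /* index of the next symbol to examine */
    static size_t text_len;  /* cached length of the prefix being completed */

    assert(text != NULL);

    if (state == 0) {
        next = 0;
        text_len = strlen(text);
    }

    while (demo_symbols[next] != NULL) {
        const char *name = demo_symbols[next++];

        if (strncmp(name, text, text_len) == 0) {
            size_t n = strlen(name) + 1;
            char *copy = malloc(n);

            if (copy == NULL)
                return NULL;     /* treat allocation failure as end of list */
            memcpy(copy, name, n);
            return copy;
        }
    }
    return NULL;
}
```

In a program linked against readline, a generator like this would typically
be passed to rl_completion_matches() from a function installed in
rl_attempted_completion_function, and setting rl_attempted_completion_over
to a non-zero value there should keep readline from falling back to filename
completion.  The only allocation in the sketch is the per-match copy; the
candidate list itself is never built or resized.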

Comment 13 errata-xmlrpc 2019-08-06 12:41:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2071