Bug 365111

Summary: gdb's "return VAL" command appears to misbehave
Product: [Fedora] Fedora
Reporter: Jim Meyering <meyering>
Component: gdb
Assignee: Jan Kratochvil <jan.kratochvil>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 9
CC: drepper, dvlasenk, jan.kratochvil, triage
Target Milestone: ---
Keywords: Reopened
Target Release: ---   
Hardware: All   
OS: Linux   
URL: http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/11944
Whiteboard: bzcl34nup
Fixed In Version: 6.8.50.20090302-13.fc11
Doc Type: Bug Fix
Last Closed: 2009-03-30 19:42:54 UTC
Attachments:
Description Flags
Fix (w/o a testcase so far). none

Description Jim Meyering 2007-11-03 11:14:29 UTC
Description of problem:


Version-Release number of selected component (if applicable): glibc-2.7-2


How reproducible: always


Steps to Reproduce:

glibc's stream output functions (e.g., printf, fputs, fwrite) always
allocate memory upon stream initialization.  The first output appears
to cause allocation of a 4KB block (a page).  I want to know whether
that first output operation can fail without setting the stream error
indicator, like other ENOMEM failures do.  I tried to provoke a failure
of that precise allocation, to see whether it would produce an
ferror-detectable failure, but ran into something else.  When that
particular mmap call fails (I think it's the one in filedoalloc.c from
the ALLOC_BUF macro), it ends up causing a segfault 5 or 6 levels up
the stack.

Glancing through the code, this seems to happen because
_IO_new_file_overflow isn't prepared for a NULL f->_IO_buf_base pointer.
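
For reference, a minimal C sketch of what "ferror-detectable" means from the caller's side.  The helper name checked_puts is hypothetical; it is not part of glibc or of the test case in this report, and only illustrates checking the return values and the stream error indicator.

```c
#include <stdio.h>

/* Hypothetical helper: write TEXT to STREAM and report whether the
   failure, if any, is visible to the caller.
   Returns 0 on success, -1 if any step reported an error. */
int checked_puts(FILE *stream, const char *text)
{
    if (fputs(text, stream) == EOF)   /* the write itself failed */
        return -1;
    if (fflush(stream) == EOF)        /* pushing buffered data failed */
        return -1;
    if (ferror(stream))               /* error indicator is set */
        return -1;
    return 0;
}
```

A caller would use it as checked_puts(stdout, "foo") and treat -1 as a failed write.  A segfault inside the output function, as in this report, never gets that far.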

  $ printf '#include <stdio.h>\nint main(){printf("foo");return 0;}\n' > k.c
  $ gcc -pthread -g -Wall -W -O k.c
  $ gdb -q ./a.out
  Using host libthread_db library "/lib64/libthread_db.so.1".
  (gdb) b printf
  Function "printf" not defined.
  Make breakpoint pending on future shared library load? (y or [n]) y

  Breakpoint 1 (printf) pending.
  (gdb) r
  Starting program: /t/a/a.out

  Breakpoint 2 at 0x3c5084ca00
  Pending breakpoint "printf" resolved

  Breakpoint 2, 0x0000003c5084ca00 in printf () from /lib64/libc.so.6
  (gdb) b mmap64
  Breakpoint 3 at 0x3c508d18d0
  (gdb) c
  Continuing.

  Breakpoint 3, 0x0000003c508d18d0 in mmap64 () from /lib64/libc.so.6
  (gdb) ret -1
  Make selected stack frame return now? (y or n) y

  #0  0x0000003c508613db in _IO_file_doallocate_internal () from /lib64/libc.so.6
  (gdb) p errno=12
  $2 = 12
  (gdb) c
  Continuing.

  Program received signal SIGSEGV, Segmentation fault.
  0x0000003c5086c8dc in _IO_new_file_overflow () from /lib64/libc.so.6
  (gdb) bt
  #0  0x0000003c5086c8dc in _IO_new_file_overflow () from /lib64/libc.so.6
  #1  0x0000003c5086ec34 in _IO_default_xsputn_internal () from /lib64/libc.so.6
  #2  0x0000003c5086d881 in _IO_new_file_xsputn () from /lib64/libc.so.6
  #3  0x0000003c50842f50 in vfprintf () from /lib64/libc.so.6
  #4  0x0000003c5084ca9a in printf () from /lib64/libc.so.6
  #5  0x00000000004004cb in main () at k.c:2
  (gdb)

I simulated the mmap64 failure by making it return -1 (the "ret -1"
above) and setting errno=12.  12 is ENOMEM:

    $ e=ENOMEM; perl -le "use Errno '$e';print $e"
    12
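
The same lookup can be sketched in C.  The helper names below are hypothetical; the only things assumed to exist are the standard ENOMEM macro from <errno.h> and strerror from <string.h>.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: the numeric value of ENOMEM on this platform. */
int enomem_value(void)
{
    return ENOMEM;
}

/* Hypothetical helper: the human-readable message for ENOMEM. */
const char *enomem_message(void)
{
    return strerror(ENOMEM);
}
```

Printing enomem_value() with "%d" should agree with the perl output above on the same machine.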

This is on fedora rawhide.

  $ rpm -q glibc
  glibc-2.7-2

Actual results: as above


Expected results: no segfault


Additional info:

Comment 1 Jim Meyering 2007-11-24 20:26:53 UTC
Actually, glibc *does* handle this condition.  You can see that if you repeat
the experiment above, but do "ret (void *)-1" instead.  In that case, the printf
and program complete normally.
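
A possible explanation for the difference, offered as an assumption and not confirmed anywhere in this report: mmap signals failure with MAP_FAILED, which is (void *)-1, so on x86-64 the caller compares the full 64-bit return register.  If a plain -1 is treated as a 32-bit int and the upper half of the register does not end up all ones, that comparison fails.  The helpers below are hypothetical and only model the arithmetic on an LP64 platform.

```c
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical model: register contents if only a 32-bit int were
   stored and the upper 32 bits were zero (an assumption). */
uint64_t reg_from_int(int value)
{
    return (uint64_t)(uint32_t)value;
}

/* Hypothetical model: register contents if a full pointer were stored. */
uint64_t reg_from_pointer(void *value)
{
    return (uint64_t)(uintptr_t)value;
}

/* Would a caller of mmap see this register value as a failure? */
int is_map_failed(uint64_t reg)
{
    return (void *)(uintptr_t)reg == MAP_FAILED;
}
```

Under these assumptions, is_map_failed(reg_from_int(-1)) is false while is_map_failed(reg_from_pointer((void *)-1)) is true, which would match "ret -1" leading to a crash and "ret (void *)-1" working.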

Comment 2 Jim Meyering 2007-12-12 18:42:42 UTC
Hmm... don't know what changed, but now it fails again.
And this time I have debug symbols:

$ gdb -q ./a.out
Using host libthread_db library "/lib64/libthread_db.so.1".
(gdb) b printf
Function "printf" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (printf) pending.
(gdb) r
Starting program: /t/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 0x2aaaaaac8b00 (LWP 13537)]
[Switching to Thread 0x2aaaaaac8b00 (LWP 13537)]

Breakpoint 1, __printf (format=0x400618 "foo") at printf.c:30
30      {
(gdb) b mmap64
Breakpoint 2 at 0x3c508d18d0
(gdb) c
Continuing.

Breakpoint 2, 0x0000003c508d18d0 in mmap64 () from /lib64/libc.so.6
(gdb) ret (void*)-1
Make selected stack frame return now? (y or n) y   

#0  0x0000003c508613db in _IO_file_doallocate (fp=<value optimized out>)
    at filedoalloc.c:120
120       ALLOC_BUF (p, size, EOF);
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
_IO_new_file_overflow (f=<value optimized out>, ch=<value optimized out>)
    at fileops.c:876
876       *f->_IO_write_ptr++ = ch;
(gdb) bt
#0  _IO_new_file_overflow (f=<value optimized out>, ch=<value optimized out>)
    at fileops.c:876
#1  0x0000003c5086ec34 in _IO_default_xsputn (f=<value optimized out>, 
    data=<value optimized out>, n=<value optimized out>) at genops.c:486
#2  0x0000003c5086d881 in _IO_new_file_xsputn (f=<value optimized out>, 
    data=<value optimized out>, n=<value optimized out>) at fileops.c:1370
#3  0x0000003c50842f50 in _IO_vfprintf_internal (s=<value optimized out>, 
    format=<value optimized out>, ap=<value optimized out>) at vfprintf.c:1301
#4  0x0000003c5084ca9a in __printf (format=0x66 <Address 0x66 out of bounds>)
    at printf.c:35
#5  0x000000000040050b in main () at k.c:2

Comment 3 Bug Zapper 2008-04-04 14:23:40 UTC
Based on the date this bug was created, it appears to have been reported
during the development of Fedora 8. In order to refocus our efforts as
a project we are changing the version of this bug to '8'.

If this bug still exists in rawhide, please change the version back to
rawhide.
(If you're unable to change the bug's version, add a comment to the bug
and someone will change it for you.)

Thanks for your help and we apologize for the interruption.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 4 Jim Meyering 2008-04-04 15:03:55 UTC
I've retested against today's rawhide, and see exactly the same results, down to
the line numbers in the backtrace.

glibc-2.7.90-13.x86_64                                                           

Comment 5 Bug Zapper 2008-05-14 03:50:27 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 6 Ulrich Drepper 2008-08-03 05:22:25 UTC
I see this also with the instructions you provide.  But there must be a problem with gdb.  Try this instead:

- put a breakpoint at the ALLOC_BUF macro use in filedoalloc.c

- run until breakpoint hit

- disp/i $pc

- use ni until _after_ the mmap call

- now use set $rax=0xffffffffffffffff   (equivalent for other archs, this is x86-64)

- c

This works fine, the short buffer is used.
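
A toy C model of the fallback described here: if the page-sized allocation fails, the stream keeps working with a tiny built-in buffer.  This is a sketch of the idea only, not glibc's code; struct toy_stream, toy_doallocbuf and the two allocators are hypothetical names.

```c
#include <stddef.h>
#include <sys/mman.h>   /* only for MAP_FAILED */

/* Toy stand-in for a stdio stream; all names here are hypothetical. */
struct toy_stream {
    char *buf_base;     /* start of the buffer in use */
    char *buf_end;      /* one past the end of that buffer */
    char shortbuf[1];   /* built-in one-byte fallback buffer */
};

/* An allocator that reports failure the way mmap does. */
typedef void *(*toy_alloc_fn)(size_t size);

static char toy_page[4096];

/* Succeeds for requests up to one page by handing out a static page. */
void *toy_alloc_ok(size_t size)
{
    return size <= sizeof toy_page ? (void *)toy_page : MAP_FAILED;
}

/* Always fails, like the simulated mmap64 failure. */
void *toy_alloc_fail(size_t size)
{
    (void)size;
    return MAP_FAILED;
}

/* Try to allocate a buffer; on failure fall back to the short buffer. */
void toy_doallocbuf(struct toy_stream *s, size_t size, toy_alloc_fn alloc)
{
    void *p = alloc(size);
    if (p == MAP_FAILED) {
        s->buf_base = s->shortbuf;
        s->buf_end = s->shortbuf + 1;
    } else {
        s->buf_base = p;
        s->buf_end = (char *)p + size;
    }
}
```

Calling toy_doallocbuf(&s, 4096, toy_alloc_fail) leaves the stream with a valid one-byte buffer, so later writes have somewhere to go and buf_base is never NULL.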

Comment 7 Jim Meyering 2009-01-18 21:46:28 UTC
Thanks.  However, gdb's "return" ought to work.  Having to simulate it with arch-specific register manipulations is not an option.  Reassigning to gdb and reopening.

Comment 8 Jim Meyering 2009-01-18 21:47:39 UTC
P.S. I've just confirmed that this still afflicts F10.

Comment 9 Jan Kratochvil 2009-01-18 22:23:13 UTC
Created attachment 329309
Fix (w/o a testcase so far).

Comment 10 Jim Meyering 2009-01-24 17:39:00 UTC
Thanks!  I've applied it to the latest archer sources, built, and confirmed that the example above now works as expected.  i.e., no segfault.

Comment 11 Jan Kratochvil 2009-03-30 19:42:54 UTC
In FSF GDB HEAD and in Rawhide:
http://sourceware.org/ml/gdb-cvs/2009-03/msg00093.html
	gdb/
	* stack.c (return_command <retval_exp>): New variables retval_expr
	and old_chain.  Inline parse_and_eval to initialize retval_expr.  Check
	RETVAL_EXPR for UNOP_CAST and set RETURN_TYPE to the RETURN_VALUE type
	if RETURN_TYPE is NULL.
	
	gdb/doc/
	* gdb.texinfo (Returning): New description for missing debug info.
	
	gdb/testsuite/
	* gdb.base/return-nodebug.exp, gdb.base/return-nodebug.c: New.
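
The stack.c entry can be read as a small decision rule.  The C below is a toy model of that rule under my reading of the ChangeLog text; it is not gdb's source, and every name in it (toy_type, toy_return_request, toy_pick_return_type) is hypothetical.

```c
/* Toy model of choosing the type for "return VAL"; not gdb code. */
enum toy_type { TOY_UNKNOWN, TOY_INT, TOY_POINTER };

struct toy_return_request {
    enum toy_type debug_info_type;  /* TOY_UNKNOWN without debug info */
    int has_cast;                   /* nonzero if VAL is an explicit cast */
    enum toy_type cast_type;        /* type named in that cast */
};

enum toy_type toy_pick_return_type(const struct toy_return_request *req)
{
    /* Debug info, when present, says what the function returns. */
    if (req->debug_info_type != TOY_UNKNOWN)
        return req->debug_info_type;

    /* No debug info: take the type from the user's explicit cast. */
    if (req->has_cast)
        return req->cast_type;

    /* Neither is available; the ChangeLog entry above does not say
       what happens in this case, so the model leaves it unknown. */
    return TOY_UNKNOWN;
}
```

In this model, "ret (void*)-1" in a frame without debug info yields TOY_POINTER, which corresponds to the case from this report.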