Bug 224679 - FEAT: Executing >2GB binaries with <2GB code but >2GB debug
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Dave Anderson
QA Contact: Brian Brock
Keywords: FutureFeature
Depends On:
Blocks: RHEL5u2_relnotes 393501 425461
Reported: 2007-01-26 18:53 EST by Jan Kratochvil
Modified: 2008-05-21 10:41 EDT (History)

See Also:
Fixed In Version: RHBA-2008-0314
Doc Type: Enhancement
Last Closed: 2008-05-21 10:41:06 EDT

Attachments
.gz.bz2 of 2.5GB RHEL5.i386 ELF for convenience on memory-limited systems. (3.50 MB, application/octet-stream)
2007-12-06 11:21 EST, Jan Kratochvil

Description Jan Kratochvil 2007-01-26 18:53:02 EST
The tested kernel-2.6.18-4.el5.x86_64 refuses to load a >2GB file even though
it has a very small code section (it has a >2GB `.debug_macinfo' DWARF section).
  execve("./main", ["./main"], [/* 59 vars */]) = -1 EFBIG (File too large)

While loading >2GB of code could be a major effort, loading only standard <2GB
code/data/bss sizes on x86_64 should not be much of a problem; only some file
offsets need to be 64-bit.

Such functionality should apply even for i686.

While not directly requested by a customer, it was found during the Bug 222814
evaluation: the customer is using libraries with >2GB of debug data (these work
fine).

-- Additional comment from jkratoch@redhat.com on 2007-01-22 18:39 EST --
Created an attachment (id=146256)
Testcase .tar.gz creating >2GB file with >2GB `.debug_macinfo' DWARF section.

A machine with >=5GB of physical RAM is required for an acceptable build time.


-- Additional comment from jkratoch@redhat.com on 2007-01-22 19:53 EST --
Created an attachment (id=146263)
.bz2.bz2 of 2.5GB RHEL5.x86_64 ELF for convenience on memory-limited systems.
Comment 2 Dave Anderson 2007-10-26 16:47:33 EDT
AFAICT, despite all of the discussion about the contents of the
attached "main" executable (the big `.debug_macinfo' DWARF section,
etc...), the issue at hand appears to be simply a matter of file size.

Tinkering with kprobes, I found that when the failing sys_execve()
occurs with EFBIG, the function trace is this:

 sys_execve
  do_execve
   open_exec
    nameidata_to_filp
     __dentry_open
      generic_file_open  (ext3_file_operations.open)

where generic_file_open() returns EFBIG because the
inode's file size is greater than MAX_NON_LFS (2GB-1):
  
  /*
   * Called when an inode is about to be open.
   * We use this to disallow opening large files on 32bit systems if
   * the caller didn't specify O_LARGEFILE.  On 64bit systems we force
   * on this flag in sys_open.
   */
  int generic_file_open(struct inode * inode, struct file * filp)
  {
          if (!(filp->f_flags & O_LARGEFILE) && i_size_read(inode) > MAX_NON_LFS)
                  return -EFBIG;
          return 0;
  }
  
As the function's comment indicates, when the file is explicitly opened
by the sys_open() system call, O_LARGEFILE gets set:

  asmlinkage long sys_open(const char __user *filename, int flags, int mode)
  {
          long ret;
  
          if (force_o_largefile())
                  flags |= O_LARGEFILE;
  
          ret = do_sys_open(AT_FDCWD, filename, flags, mode);
          /* avoid REGPARM breakage on x86: */
          prevent_tail_call(ret);
          return ret;
  }

where force_o_largefile() looks like this:

  #define force_o_largefile() (BITS_PER_LONG != 32)
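
To make the check concrete, here is a minimal user-space sketch of the same
logic. It is not kernel code: model_generic_file_open(), MODEL_MAX_NON_LFS and
MODEL_O_LARGEFILE are hypothetical names invented for illustration, the limit
assumes the 2GB-1 value mentioned above, and the flag value is arbitrary.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions (assumed values). */
#define MODEL_MAX_NON_LFS  ((int64_t)((1ULL << 31) - 1))  /* 2GB-1 */
#define MODEL_O_LARGEFILE  0x1u                           /* arbitrary flag bit */

/*
 * Models the size check in generic_file_open(): a file larger than
 * 2GB-1 is refused unless the large-file flag is set on the open.
 */
int model_generic_file_open(int64_t i_size, unsigned int f_flags)
{
        if (!(f_flags & MODEL_O_LARGEFILE) && i_size > MODEL_MAX_NON_LFS)
                return -EFBIG;
        return 0;
}
```

Under these assumptions, a file of exactly 2GB-1 bytes passes with no flags,
while anything larger passes only when the large-file flag is present. Nothing
in the check depends on what the extra bytes contain.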
    
Anyway, this would occur with any executable greater than 2GB in size.
For example, I took this program:

  #include <stdio.h>

  int main(void)
  {
          printf("hello world\n");
          return 0;
  }

compiled it into "hello":

  # hello
  hello world
  #

then did this:

  # cat main >> hello
  # ./hello
  -bash: ./hello: File too large
  #
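
For anyone without the large attachment at hand, a similar test file could
probably be produced by padding a small binary past the 2GB mark. The sketch
below is only an assumption-laden illustration: pad_file_to() is a hypothetical
helper, it relies on the filesystem supporting large (ideally sparse) files, and
it assumes trailing bytes do not affect loading, as the "cat" experiment above
suggests.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: grow `path` to at least `new_size` bytes by
 * extending it with zero bytes (usually stored as a sparse hole).
 * A file that is already large enough is left untouched.
 * Returns 0 on success, -1 on error.
 *
 * On 32-bit builds, compile with -D_FILE_OFFSET_BITS=64 so that
 * off_t can represent sizes above 2GB-1.
 */
int pad_file_to(const char *path, off_t new_size)
{
        struct stat st;
        int rc = -1;
        int fd = open(path, O_WRONLY);

        if (fd < 0)
                return -1;
        if (fstat(fd, &st) == 0) {
                if (st.st_size >= new_size)
                        rc = 0;                        /* already big enough */
                else
                        rc = ftruncate(fd, new_size);  /* extend with zeros */
        }
        close(fd);
        return rc;
}
```

Calling pad_file_to("hello", 2588871678LL) would bring "hello" up to the size
of the attached "main" without copying 2.5GB of real data.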

I wrote a jprobe handler that catches the generic_file_open() function,
prints the inode and file pointer information, dumps the stack,
and then force-sets the O_LARGEFILE bit in filp->f_flags:

  # insmod jprobe.ko
  # ./hello
  hello world
  # dmesg
  Planted jprobe at ffffffff810a1bd3, handler addr ffffffff88215000
  generic_file_open: inode=0xffff81004c58adc0, i_size: 2588871678 \
                     filp=0xffff81008a62f700 f_flags: 0

  Call Trace:
   [<ffffffff8821502f>] :jprobe:jdo_fork+0x2f/0x64
   [<ffffffff810a1f3a>] __dentry_open+0xd9/0x1b0
   [<ffffffff810a7401>] open_exec+0x76/0xc0
   [<ffffffff8109c115>] init_object+0x27/0x6e
   [<ffffffff8109dd18>] kmem_cache_alloc+0x7a/0xa0
   [<ffffffff810544bd>] trace_hardirqs_on+0x12e/0x151
   [<ffffffff810a8549>] do_execve+0x46/0x1f6
   [<ffffffff8100a61d>] sys_execve+0x36/0x4c
   [<ffffffff8100bff7>] stub_execve+0x67/0xb0

  pid: 10186 comm: bash (setting O_LARGEFILE)
  #

(BTW jprobes is pretty cool -- it's the first time I've ever used it...)

Anyway, it seems as simple as forcing the O_LARGEFILE in the sys_execve()
trail some place.  I wonder why nobody has ever complained about this
before.

Oh yeah -- my testing above was on an FC8 (2.6.23) kernel.
 
Comment 3 Dave Anderson 2007-10-30 15:40:03 EDT
> Anyway, it seems as simple as forcing the O_LARGEFILE in the sys_execve()
> trail some place.  I wonder why nobody has ever complained about this
> before.

My proposed RHEL5 linux-kernel-test.patch to open_exec(), the brew-built
kernel, and the kernel src.rpm can be found here:

  http://people.redhat.com/anderson/BZ_224679

Tested with the attached "main" executable.
Comment 11 Dave Anderson 2007-12-06 08:28:16 EST
> Such functionality should apply even for i686.

Jan,

Andi Kleen agrees with you there, i.e., that this should
also apply to 32-bit arches.

Can you create a >2GB i386 executable that I can test?

Thanks,
  Dave
 
Comment 12 Jan Kratochvil 2007-12-06 11:21:14 EST
Created attachment 279851 [details]
.gz.bz2 of 2.5GB RHEL5.i386 ELF for convenience on memory-limited systems.

But gcc.i386 is unable to create such a file (I used gcc.x86_64 -m32):
cc1: out of memory allocating 8016 bytes after a total of 925716480 bytes
Comment 13 Dave Anderson 2007-12-07 10:47:56 EST
The i386 test program runs fine with my posted patch.

Upstream, Andi Kleen extended my post to unconditionally
set O_LARGEFILE in open_exec(), and also in sys_uselib():

=========================================================

To: Dave Anderson <anderson@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>, linux-kernel@vger.kernel.org
Subject: [NEW-PATCH] exec: allow > 2GB executables to run on 64-bit systems

Since Dave didn't post an updated patch. This is how I think what
the patch should be. I also changed sys_uselib just to be complete.

----

Always use O_LARGEFILE for opening executables

This allows to use executables >2GB.

Based on a patch by Dave Anderson

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux-2.6.24-rc3/fs/exec.c
===================================================================
--- linux-2.6.24-rc3.orig/fs/exec.c
+++ linux-2.6.24-rc3/fs/exec.c
@@ -119,7 +119,7 @@ asmlinkage long sys_uselib(const char __
 	if (error)
 		goto exit;
 
-	file = nameidata_to_filp(&nd, O_RDONLY);
+	file = nameidata_to_filp(&nd, O_RDONLY|O_LARGEFILE);
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
 		goto out;
@@ -658,7 +658,8 @@ struct file *open_exec(const char *name)
 			int err = vfs_permission(&nd, MAY_EXEC);
 			file = ERR_PTR(err);
 			if (!err) {
-				file = nameidata_to_filp(&nd, O_RDONLY);
+				file = nameidata_to_filp(&nd,
+							O_RDONLY|O_LARGEFILE);
 				if (!IS_ERR(file)) {
 					err = deny_write_access(file);
 					if (err) {
Comment 15 Don Zickus 2008-01-10 15:39:40 EST
in 2.6.18-66.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 17 Don Domingo 2008-02-05 21:36:06 EST
added to RHEL5.2 release notes under "Kernel-Related Updates":

<quote>
Executing binaries with more than 2GB of debug information no longer fails.
</quote>

please advise if any further revisions are required. thanks!
Comment 18 Dave Anderson 2008-02-06 08:53:08 EST
(In reply to comment #17)
> added to RHEL5.2 release notes under "Kernel-Related Updates":
> 
> <quote>
> Executing binaries with more than 2GB of debug information no longer fails.
> </quote>
> 
> please advise if any further revisions are required. thanks!

It does not require 2GB of *debug* information -- it's simply a matter of
the file size of 64-bit binaries.  So you could say something like:

<quote>
Executing 64-bit binaries greater than 2GB no longer fails.
</quote>


Comment 19 Jan Kratochvil 2008-02-06 13:23:19 EST
While technically right, I would find it more misleading, as RHEL still does not
support general executables with >2GB of code/data because GCC cannot produce them.

See man gcc, -mcmodel=large:
  Generate code for the large model: This model makes no assumptions about
  addresses and sizes of sections.  Currently GCC does not implement this model.
Linking fails as everything is using R_X86_64_32 / R_X86_64_PC32 by the model:
-mcmodel=small:
  Generate code for the small code model: the program and its symbols must be
  linked in the lower 2 GB of the address space.  Pointers are 64 bits.
  Programs can be statically or dynamically linked.
  This is the default code model.
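
To illustrate why those relocation types tie the small model to a 2GB range,
here is a minimal sketch of the range checks a linker has to apply. The
function names are hypothetical; S, A and P follow the usual psABI notation
(symbol value, addend, and address of the field being relocated).

```c
#include <stdbool.h>
#include <stdint.h>

/* R_X86_64_32: the value S + A must fit in an unsigned 32-bit field. */
bool fits_r_x86_64_32(uint64_t symbol, int64_t addend)
{
        uint64_t value = symbol + (uint64_t)addend;

        return value <= UINT32_MAX;
}

/* R_X86_64_PC32: the displacement S + A - P must fit in a signed 32-bit field. */
bool fits_r_x86_64_pc32(uint64_t symbol, int64_t addend, uint64_t place)
{
        int64_t disp = (int64_t)(symbol + (uint64_t)addend - place);

        return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

When code or data grows so that a reference spans more than these ranges, the
check fails and the link cannot succeed. Debug sections are not loaded or
addressed this way, which is why only their file size mattered in this bug.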
Comment 20 Dave Anderson 2008-02-06 13:37:37 EST
(In reply to comment #19)
> While technically right, I would find it more misleading, as RHEL still does not
> support general executables with >2GB of code/data because GCC cannot produce them.

Point taken -- I was just looking at it from the kernel point of view, which
doesn't care about which parts of the binary cause the "> 2gb size".

Anyway, I rescind my release note suggestion.
Comment 21 Don Domingo 2008-04-01 22:16:40 EDT
Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don
Comment 23 errata-xmlrpc 2008-05-21 10:41:06 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html
