Bug 113890 - [PATCH] Executable compiled on x86 can cause kernel seg fault on x86_64
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64 Linux
Priority: medium Severity: medium
Assigned To: Jim Paradis
QA Contact: Brian Brock
Depends On:
Blocks: 107562
Reported: 2004-01-19 17:23 EST by Dale Mosby
Modified: 2013-08-05 21:03 EDT

Doc Type: Bug Fix
Last Closed: 2004-05-11 21:08:21 EDT


Attachments
File self_test.c shown in example. (2.14 KB, text/plain)
2004-01-19 17:27 EST, Dale Mosby
Patch file to fix. (1.72 KB, patch)
2004-01-19 17:27 EST, Dale Mosby
Updated patch (3.12 KB, patch)
2004-01-26 21:26 EST, Bernd Schmidt

Description Dale Mosby 2004-01-19 17:23:24 EST
Description of problem:

A program compiled in an x86 environment and then moved to an
AMD x86_64 (Opteron) system can cause a kernel seg fault. This
happens when the program attempts to dump core. The problem is
in file fs/binfmt_elf.c, in which array element notes[3] is not
filled in and yet is referenced. This array is placed on the
stack. Should a zero be at that location on the stack, a seg
fault will occur. Once you get the seg fault, it happens every
time. If you recompile the kernel you may no longer hit this,
though garbage data on the stack will still be used.

Version-Release number of selected component (if applicable):

2.4.21-1.1931.2.393.entsmp

How reproducible:

Always

Steps to Reproduce:
1. On an x86 system compile the test program like so:
   gcc self_test.c -o self_test -lpthread
2. Copy the executable to an Opteron box and execute like so:
   ulimit -c unlimited
   LD_ASSUME_KERNEL=2.4.1 self_test 1
  
Actual results:
The system may take a kernel seg fault and generate an Oops message,
depending on what data happens to be on the stack.

Expected results:
The test program should dump core.

Additional info:

The problem is in file "fs/binfmt_elf.c".
The following line of code in procedure "elf_core_dump" may pass
an uninitialized value:
    sz += notesize(&notes[i]);

This was analyzed by examining a stack dump and observing the
following instruction as the first instruction of strlen:
    cmpb   $0x0,(%rdi)

This was executed with %rdi as null. The failure may cause a seg
fault or may simply use a garbage value. This is entirely dependent
on stack contents, as "notes[]" is on the stack in this routine.

Here is an excerpt of the current code followed by a rewrite which
fixes the problem and is simpler code as well. (The actual code in
the file is much longer. I removed all but the incorrect code to
illustrate the error.)
----- bad code ---------------------------
static int elf_core_dump(long signr, struct pt_regs * regs, struct file * file)
{
	...
	int numnote = 5;
	struct memelfnote notes[5];

	fill_note(&notes[0], "CORE", NT_PRSTATUS, sizeof(prstatus), &prstatus);
	fill_note(&notes[1], "CORE", NT_PRPSINFO, sizeof(psinfo), &psinfo);
	fill_note(&notes[2], "CORE", NT_TASKSTRUCT, sizeof(*current), current);

#ifndef __x86_64__
	/* Try to dump the FPU. */
	if ((prstatus.pr_fpvalid = elf_core_copy_task_fpregs(current, &fpu))) {
		fill_note(&notes[3], "CORE", NT_PRFPREG, sizeof(fpu), &fpu);
	} else {
		--numnote;
	}
#else
	numnote --;
#endif
#ifdef ELF_CORE_COPY_XFPREGS
	if (elf_core_copy_task_xfpregs(current, &xfpu)) {
		fill_note(&notes[4], "LINUX", NT_PRXFPREG, sizeof(xfpu), xfpu);
	} else {
		--numnote;
	}
#else
	numnote --;
#endif

	for(i = 0; i < numnote; i++)
		sz += notesize(&notes[i]);
------------------------------------------
The problem is with the two ifdef sections. On AMD x86_64 the body of
the first section (#ifndef __x86_64__) is skipped, so notes[3] is never
filled in, and the #else branch decrements "numnote" to 4. The second
section is not skipped, so notes[4] is filled in and "numnote" stays
at 4. The for loop therefore covers notes[0] through notes[3] and calls
"notesize(&notes[3])", which makes notesize call strlen on a member of
a structure that was never initialized. If that stack location contains
zero we get a seg fault; otherwise garbage data is used. The fix
follows. It has the advantage of making the code simpler.
----- fixed code -------------------------
	/* numnote counts the notes actually filled in, so
	 * notes[0] .. notes[numnote - 1] are always valid. */
	int numnote = 0;
	struct memelfnote notes[5];

	fill_note(&notes[numnote++], "CORE", NT_PRSTATUS, sizeof(prstatus), &prstatus);
	fill_note(&notes[numnote++], "CORE", NT_PRPSINFO, sizeof(psinfo), &psinfo);
	fill_note(&notes[numnote++], "CORE", NT_TASKSTRUCT, sizeof(*current), current);

#ifndef __x86_64__
	/* Try to dump the FPU. */
	if ((prstatus.pr_fpvalid = elf_core_copy_task_fpregs(current, &fpu))) {
		fill_note(&notes[numnote++], "CORE", NT_PRFPREG, sizeof(fpu), &fpu);
	}
#endif
#ifdef ELF_CORE_COPY_XFPREGS
	if (elf_core_copy_task_xfpregs(current, &xfpu)) {
		fill_note(&notes[numnote++], "LINUX", NT_PRXFPREG, sizeof(xfpu), &xfpu);
	}
#endif
------------------------------------------
Comment 1 Dale Mosby 2004-01-19 17:27:00 EST
Created attachment 97109
File self_test.c shown in example.
Comment 2 Dale Mosby 2004-01-19 17:27:59 EST
Created attachment 97110
Patch file to fix.
Comment 3 Bernd Schmidt 2004-01-26 21:26:53 EST
Created attachment 97262
Updated patch

Seems like a similar problem exists in elf_dump_thread_status.  This patch also
fixes that instance.

I can't really test due to lack of a hammer box, but it does compile.
Comment 4 Jim Paradis 2004-01-27 17:12:53 EST
The second patch looks good to me and seems to work as advertised (the
core dump is gdb'able as well).  I'll run it by the elf/gdb folks just
for sanity's sake then submit it for U2.
Comment 6 Jim Paradis 2004-02-17 12:28:00 EST
Patch slated for U2
Comment 7 Ernie Petrides 2004-02-17 17:58:57 EST
The fix for this problem was committed to the RHEL3 U2 patch pool
on Thursday, 12-Feb-2004, for kernel version 2.4.21-9.8.
Comment 8 Don Howard 2004-04-13 21:46:25 EDT
*** Bug 117941 has been marked as a duplicate of this bug. ***
Comment 9 John Flanagan 2004-05-11 21:08:21 EDT
An errata has been issued which should help the problem described in this
bug report. This report is therefore being closed with a resolution of
ERRATA. For more information on the solution and/or where to find the
updated files, please follow the link below. You may reopen this bug
report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-188.html
