Bug 113890 - [PATCH] Executable compiled on x86 can cause kernel seg fault on x86_64
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jim Paradis
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 107562
Reported: 2004-01-19 22:23 UTC by Dale Mosby
Modified: 2013-08-06 01:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-05-12 01:08:21 UTC
Target Upstream Version:
Embargoed:


Attachments
File self_test.c shown in example. (2.14 KB, text/plain)
2004-01-19 22:27 UTC, Dale Mosby
Patch file to fix. (1.72 KB, patch)
2004-01-19 22:27 UTC, Dale Mosby
Updated patch (3.12 KB, patch)
2004-01-27 02:26 UTC, Bernd Schmidt


Links
System: Red Hat Product Errata
ID: RHSA-2004:188
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 2
Last Updated: 2004-05-11 04:00:00 UTC

Description Dale Mosby 2004-01-19 22:23:24 UTC
Description of problem:

A program compiled in an x86 environment and then moved to an
AMD x86_64 (Opteron) system can cause a kernel seg fault. This
happens when the program attempts to dump core. The problem is
in file fs/binfmt_elf.c, where array element notes[3] is never
filled in and yet is referenced. This array is placed on the
stack. Should a zero be at that location on the stack, a seg
fault will occur. Once a given kernel build hits the seg fault,
it happens every time. If you recompile the kernel you may no
longer hit it, though garbage data on the stack will still be used.

Version-Release number of selected component (if applicable):

2.4.21-1.1931.2.393.entsmp

How reproducible:

Always

Steps to Reproduce:
1. On an x86 system compile the test program like so:
   gcc self_test.c -o self_test -lpthread
2. Copy the executable to an Opteron box and execute like so:
   ulimit -c unlimited
   LD_ASSUME_KERNEL=2.4.1 self_test 1
  
Actual results:
System may take a seg fault and generate an Oops message, depending
on how data lands on stack.

Expected results:
Test program should core dump

Additional info:

The problem is in file "fs/binfmt_elf.c".
The following line of code in procedure "elf_core_dump" may pass
an uninitialized value:
    sz += notesize(&notes[i]);

This was analyzed by examining a stack dump and observing that the
faulting instruction was the first instruction of strlen:
    cmpb   $0x0,(%rdi)

This was executed with %rdi as null. The failure may cause a seg
fault or may simply use a garbage value - this is entirely dependent
on stack contents as "notes[]" is on the stack in this routine.

Here is an excerpt of the current code followed by a re-write which
fixes the problem and is simpler code as well. (The actual code in
file is much longer. I removed all but the incorrect code to
illustrate error.)
----- bad code ---------------------------
static int elf_core_dump(long signr, struct pt_regs * regs, struct file * file)
{
	...
	int numnote = 5;
	struct memelfnote notes[5];

	fill_note(&notes[0], "CORE", NT_PRSTATUS, sizeof(prstatus), &prstatus);
	fill_note(&notes[1], "CORE", NT_PRPSINFO, sizeof(psinfo), &psinfo);
	fill_note(&notes[2], "CORE", NT_TASKSTRUCT, sizeof(*current), current);

#ifndef __x86_64__
	/* Try to dump the FPU. */
	if ((prstatus.pr_fpvalid = elf_core_copy_task_fpregs(current, &fpu))) {
		fill_note(&notes[3], "CORE", NT_PRFPREG, sizeof(fpu), &fpu);
	} else {
		--numnote;
	}
#else
	numnote--;
#endif
#ifdef ELF_CORE_COPY_XFPREGS
	if (elf_core_copy_task_xfpregs(current, &xfpu)) {
		fill_note(&notes[4], "LINUX", NT_PRXFPREG, sizeof(xfpu), xfpu);
	} else {
		--numnote;
	}
#else
	numnote--;
#endif

	for(i = 0; i < numnote; i++)
		sz += notesize(&notes[i]);
------------------------------------------
The problem is with the two ifdef sections. On the AMD x86_64 the body
of the first one (#ifndef __x86_64__) is skipped, so notes[3] is left
uninitialized and "numnote" drops to 4. The second ifdef is not skipped,
so notes[4] is filled in and "numnote" stays at 4. This means that the
for loop will call "notesize(&notes[3])", resulting in notesize calling
strlen on an element of a structure that was never initialized. If the
stack contained zero at that location we get a seg fault; otherwise
garbage data is used. The fix follows. It has the advantage of making
the code simpler.
----- fixed code -------------------------
	int numnote = 0;
	struct memelfnote notes[5];

	fill_note(&notes[numnote++], "CORE", NT_PRSTATUS, sizeof(prstatus), &prstatus);
	fill_note(&notes[numnote++], "CORE", NT_PRPSINFO, sizeof(psinfo), &psinfo);
	fill_note(&notes[numnote++], "CORE", NT_TASKSTRUCT, sizeof(*current), current);

#ifndef __x86_64__
	/* Try to dump the FPU. */
	if ((prstatus.pr_fpvalid = elf_core_copy_task_fpregs(current, &fpu))) {
		fill_note(&notes[numnote++], "CORE", NT_PRFPREG, sizeof(fpu), &fpu);
	}
#endif
#ifdef ELF_CORE_COPY_XFPREGS
	if (elf_core_copy_task_xfpregs(current, &xfpu)) {
		fill_note(&notes[numnote++], "LINUX", NT_PRXFPREG, sizeof(xfpu), &xfpu);
	}
#endif
------------------------------------------

Comment 1 Dale Mosby 2004-01-19 22:27:00 UTC
Created attachment 97109 [details]
File self_test.c shown in example.

Comment 2 Dale Mosby 2004-01-19 22:27:59 UTC
Created attachment 97110 [details]
Patch file to fix.

Comment 3 Bernd Schmidt 2004-01-27 02:26:53 UTC
Created attachment 97262 [details]
Updated patch

Seems like a similar problem exists in elf_dump_thread_status. This patch
also fixes that instance.

I can't really test due to lack of a hammer box, but it does compile.

Comment 4 Jim Paradis 2004-01-27 22:12:53 UTC
The second patch looks good to me and seems to work as advertised (the
core dump is gdb'able as well).  I'll run it by the elf/gdb folks just
for sanity's sake then submit it for U2.


Comment 6 Jim Paradis 2004-02-17 17:28:00 UTC
Patch slated for U2


Comment 7 Ernie Petrides 2004-02-17 22:58:57 UTC
The fix for this problem was committed to the RHEL3 U2 patch pool
on Thursday, 12-Feb-2004, for kernel version 2.4.21-9.8.


Comment 8 Don Howard 2004-04-14 01:46:25 UTC
*** Bug 117941 has been marked as a duplicate of this bug. ***

Comment 9 John Flanagan 2004-05-12 01:08:21 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-188.html


