Bug 217628 - Memory corruption when reading /proc/kcore
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: ia64 Linux
Priority: medium Severity: high
Assigned To: Don Howard
QA Contact: Brian Brock
http://marc.theaimsgroup.com/?t=11073...
Reported: 2006-11-28 19:30 EST by Don Howard
Modified: 2007-11-30 17:06 EST (History)
Fixed In Version: RHSA-2007-0012
Doc Type: Bug Fix
Last Closed: 2007-01-17 05:51:59 EST


Description Don Howard 2006-11-28 19:30:15 EST
+++ This bug was initially created as a clone of Bug #147666 +++

Description of problem:
Possible memory corruption when /proc/kcore is read


Version-Release number of selected component (if applicable):
2.4.9-e.57


How reproducible:
dd if=/proc/kcore of=/tmp/kcore bs=4k count=10
(if necessary, repeat a few times)
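
For readers who want to see what the dd invocation amounts to, here is a minimal Python sketch of the same bounded read: at most 10 blocks of 4 KiB (40960 bytes) from the start of a stream. The function name read_prefix and its parameters are illustrative and not part of any tool mentioned in this report. Per the results described below, pointing such a read at /proc/kcore on an affected kernel may freeze the machine, so the demo only reads an in-memory stream.

```python
import io


def read_prefix(stream, block_size=4096, count=10):
    """Read at most block_size * count bytes from a binary stream,
    one block at a time, similar in spirit to `dd bs=4k count=10`.

    `read_prefix` is a hypothetical helper written for illustration.
    """
    chunks = []
    for _ in range(count):
        block = stream.read(block_size)
        if not block:  # end of stream reached before `count` blocks
            break
        chunks.append(block)
    return b"".join(chunks)


# Demo on an in-memory stream rather than /proc/kcore.
demo = read_prefix(io.BytesIO(b"\0" * 100000))
print(len(demo))
```

The point of the bound is that only the first 40960 bytes are requested, which is why this reproducer exercises the start of the file (where the ELF header lives) rather than the whole of memory.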

Steps to Reproduce:
see above
  
Actual results:
Various; usually the machine freezes after some /proc/kcore reads.

Expected results:
No problems, /proc/kcore is correctly read.

Additional info:
The problem is that the size of the kcore header is calculated incorrectly if
there are lots of VMAs. The reason is that the size of the data fields in the
ELF notes is not accounted for in get_kcore_size() (fs/proc/kcore.c).
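
The arithmetic behind this can be sketched in Python. This is not the kernel code or the posted patch; it is a rough model of an ELF core header size under stated assumptions: ELF64 structure sizes, a 4 KiB page, three notes named "CORE", and made-up note data sizes (NOTE_DESC_SIZES is purely hypothetical). It also assumes the buffer is rounded up to whole pages, which is one plausible reason the shortfall only shows up once there are many VMAs.

```python
# Sizes of the fixed ELF64 structures.
EHDR_SIZE = 64   # sizeof(Elf64_Ehdr)
PHDR_SIZE = 56   # sizeof(Elf64_Phdr)
NHDR_SIZE = 12   # sizeof(Elf64_Nhdr): namesz, descsz, type

PAGE_SIZE = 4096                    # assumption; the real page size may differ
NOTE_NAME = "CORE"                  # assumption for illustration
NOTE_DESC_SIZES = (336, 136, 1728)  # hypothetical data sizes of three notes


def roundup(n, align):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def note_size(desc_size, include_data=True):
    """Size of one ELF note: header, padded name, and optionally its data."""
    size = NHDR_SIZE + roundup(len(NOTE_NAME) + 1, 4)
    if include_data:
        size += roundup(desc_size, 4)
    return size


def header_size(num_vma, desc_sizes, include_data=True):
    """ELF header + one PT_NOTE and num_vma PT_LOAD program headers + notes."""
    phdrs = (1 + num_vma) * PHDR_SIZE
    notes = sum(note_size(d, include_data) for d in desc_sizes)
    return EHDR_SIZE + phdrs + notes


def pages_needed(nbytes):
    """Number of whole pages needed to hold nbytes."""
    return roundup(nbytes, PAGE_SIZE) // PAGE_SIZE


for num_vma in (5, 40):
    short = header_size(num_vma, NOTE_DESC_SIZES, include_data=False)
    full = header_size(num_vma, NOTE_DESC_SIZES)
    print(num_vma, short, full, pages_needed(short), pages_needed(full))
```

With these illustrative numbers, both estimates fit in one page for 5 VMAs, but for 40 VMAs the estimate that omits the note data still asks for one page while the full header needs two. Writing the full header into the smaller buffer would overrun it, which is consistent with the corruption described here.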


RH's Ernie Petrides has posted a patch for this to LKML.
http://marc.theaimsgroup.com/?t=110739734900006&r=1&w=2

It was accepted by Marcelo into 2.4 mainline.
http://linux.bkbits.net:8080/linux-2.4/cset@42024081gb19vludDwvjkxZjV0NvPg?nav=index.html|src/|src/fs|src/fs/proc|related/fs/proc/kcore.c

In 2.6 the problem has been fixed for 1.5 years.
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.69/2.5.69-mm9/broken-out/proc-kcore-rework.patch

BUG 141394 contains references to this problem for RHEL3.

-- Additional comment from Martin.Wilck@fujitsu-siemens.com on 2005-02-10 04:05 EST --
According to Ernie, this was accepted into the RHEL-U3 patch set.
The patch is pretty small and can hardly break stuff, so it'd be nice to see it
in AS2.1 ASAP, too.
Comment 3 Mike Gahagan 2006-12-20 10:25:50 EST
The system survives the reproducer using dd; however, on two occasions I have
killed the system with a:

cat /proc/kcore > /dev/null

Comment 4 Mike Gahagan 2006-12-20 10:38:12 EST
The failure seems to match the description of bz 213567. I can verify that the
changes that went into 213567 are in the e.64 kernel so I suspect that something
else is going on here.
Comment 5 Don Howard 2006-12-20 12:13:35 EST
The telltale sign for 213567 is that the cat process dies in read_kcore() when
trying to read un-mapped vmalloc()ed memory.

Derry does not use vmalloc() in proc_file_read(), so there must be a different
reason for the crash you see. (It could be some other use of vmalloc())

Can you collect a vmcore?
Comment 6 Don Howard 2006-12-20 12:27:10 EST
Also, cat /proc/kcore > /dev/null has the possibility of touching read-volatile
memory.  In that case, a crash or hang *would* be expected.
Comment 7 Don Howard 2006-12-20 14:49:48 EST
cat of /proc/kcore results in immediate hang, hardware alarm sounds, and machine
reboot on my local zx2000.  This is true of kernels e.58, e.60, and e.64.  This
is not the same issue that I found in 213567, nor the issue addressed in this BZ.  

I strongly suspect that this is due to the scenario mentioned above - reading of
random device registers.  
Comment 8 Marcel Holtmann 2006-12-21 11:18:51 EST
Can you please verify that the initial kernel we shipped would also hang on a
cat of /proc/kcore? If yes, then we need a separate bug report for it and it has
nothing to do with the current errata.
Comment 9 Mike Gahagan 2007-01-02 16:42:53 EST
Hi,

I just tried a 'cat /proc/kcore > /dev/null' using 2.4.18-e.12 (RHEL 2.1 for
ia64 GA kernel) and was able to hang the system. Unfortunately I have yet to get
a vmcore for any of these as it doesn't look like netdump works on itanium :(

I'll see about getting some serial console output.
 
Comment 10 Don Howard 2007-01-03 18:02:11 EST
Hi Mike -

You are correct, 2.1 does not have netdump support on ia64.  

I've tested this some more today, and I see hangs under rhel3 on ia64 with this
too. I'm pretty certain that the hang you have encountered is not related to the
issue addressed here.

Comment 13 Red Hat Bugzilla 2007-01-17 05:52:00 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0012.html
