Red Hat Bugzilla – Bug 217628
Memory corruption when reading /proc/kcore
Last modified: 2007-11-30 17:06:55 EST
+++ This bug was initially created as a clone of Bug #147666 +++
Description of problem:
Possible memory corruption when /proc/kcore is read
Version-Release number of selected component (if applicable):
Steps to Reproduce:
dd if=/proc/kcore of=/tmp/kcore bs=4k count=10
(if necessary, repeat a few times)
Actual results:
Various; usually the machine freezes after some /proc/kcore reads.
Expected results:
No problems, /proc/kcore is correctly read.
The problem is that the size of the kcore header is calculated incorrectly if
there are lots of VMAs. The reason is that the size of the data fields in the
ELF notes is not accounted for in get_kcore_size() (fs/proc/kcore.c).
RH's Ernie Petrides has posted a patch for this to LKML.
It was accepted by Marcelo into 2.4 mainline.
In 2.6 the problem has been fixed for 1.5 years.
BUG 141394 contains references to this problem for RHEL3.
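The size arithmetic described above can be sketched in user space as follows. This is only an illustration of the undercount, not the patch that was posted to LKML: struct note_desc, the function names, the 64-bit ELF sizes, and the 4-byte padding are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a 64-bit ELF file (illustrative, not taken from the bug). */
#define EHDR_SIZE     64u  /* ELF file header */
#define PHDR_SIZE     56u  /* one program header */
#define NOTE_HDR_SIZE 12u  /* namesz, descsz, type: three 32-bit words */

/* Hypothetical description of one note: name and data lengths in bytes. */
struct note_desc {
    size_t name_len;  /* includes the terminating NUL */
    size_t data_len;  /* length of the note's data field */
};

/* Assumption: note names and data are padded to 4-byte boundaries. */
static size_t round_up4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Undersized variant: counts note headers and names, ignores the data fields. */
static size_t notes_size_without_data(const struct note_desc *notes, size_t count)
{
    size_t total = 0;

    assert(notes != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += NOTE_HDR_SIZE + round_up4(notes[i].name_len);
    return total;
}

/* Corrected variant: also counts the padded data field of every note. */
static size_t notes_size_with_data(const struct note_desc *notes, size_t count)
{
    size_t total = 0;

    assert(notes != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += NOTE_HDR_SIZE + round_up4(notes[i].name_len)
                 + round_up4(notes[i].data_len);
    return total;
}

/* Whole header: file header, one program header per segment, then the notes. */
static size_t kcore_header_size(size_t num_phdrs, size_t notes_bytes)
{
    return EHDR_SIZE + num_phdrs * PHDR_SIZE + notes_bytes;
}
```

With three notes named "CORE" (name length 5) and assumed data lengths of 336, 136 and 1000 bytes, the first function returns 60 and the second 1532, so a buffer sized from the first would be 1472 bytes too small for the notes written into it.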
-- Additional comment from Martin.Wilck@fujitsu-siemens.com on 2005-02-10 04:05
According to Ernie, this was accepted into the RHEL-U3 patch set.
The patch is pretty small and can hardly break stuff, so it'd be nice to see it
in AS2.1 ASAP, too.
The system survives the reproducer using dd; however, on two occasions I have
killed the system with:
cat /proc/kcore > /dev/null
The failure seems to match the description of bz 213567. I can verify that the
changes that went into 213567 are in the e.64 kernel so I suspect that something
else is going on here.
The telltale for 213567 is that the cat process dies in read_kcore() when
trying to read un-mapped vmalloc()ed memory.
Derry does not use vmalloc() in proc_file_read(), so there must be a different
reason for the crash you see. (It could be some other use of vmalloc())
Can you collect a vmcore?
Also, cat /proc/kcore > /dev/null has the possibility of touching read-volatile
memory. In that case, a crash or hang *would* be expected.
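For contrast with an unbounded cat, a bounded read in the spirit of the dd reproducer (bs=4k count=10) stops after a fixed number of bytes instead of walking the whole file. The sketch below is a hypothetical helper, not code from any kernel or tool; read_bounded() and its parameters are assumptions, and whether a bounded read is safe on a given machine is not established here.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read at most block * count bytes from fd and discard them.
 * Returns the number of bytes actually read, or -1 on error.
 */
static long read_bounded(int fd, size_t block, size_t count)
{
    size_t limit = block * count;
    size_t total = 0;
    char *buf;

    assert(block > 0);
    buf = malloc(block);
    if (buf == NULL)
        return -1;

    while (total < limit) {
        size_t want = limit - total;
        ssize_t got;

        if (want > block)
            want = block;
        got = read(fd, buf, want);
        if (got < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the read */
            free(buf);
            return -1;
        }
        if (got == 0)           /* end of file before the limit */
            break;
        total += (size_t)got;
    }

    free(buf);
    return (long)total;
}
```

Calling read_bounded(fd, 4096, 10) on an open descriptor reads no more than 40960 bytes, which is the same upper bound the dd command in the description uses.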
cat of /proc/kcore results in immediate hang, hardware alarm sounds, and machine
reboot on my local zx2000. This is true of kernels e.58, e.60, and e.64. This
is not the same issue that I found in 213567, nor the issue addressed in this BZ.
I strongly suspect that this is due to the scenario mentioned above - reading of
random device registers.
Can you please verify that the initial kernel we shipped would also hang on a
cat of /proc/kcore? If yes, then we need a separate bug report for it and it has
nothing to do with the current errata.
I just tried a 'cat /proc/kcore > /dev/null' using 2.4.18-e.12 (RHEL 2.1 for
ia64 GA kernel) and was able to hang the system. Unfortunately I have yet to get
a vmcore for any of these as it doesn't look like netdump works on itanium :(
I'll see about getting some serial console output.
Hi Mike -
You are correct, 2.1 does not have netdump support on ia64.
I've tested this some more today, and I see hangs under rhel3 on ia64 with this
too. I'm pretty certain that the hang you have encountered is not related to the
issue addressed here.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.