112945 – "cat /dev/kmem > /dev/null" panics hugemem kernel

Status:	CLOSED ERRATA
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Version:	3.0
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Assignee:	Ernie Petrides

Reported:	2004-01-06 15:42 UTC by Eric Hagberg
Modified:	2007-11-30 22:06 UTC
CC List:	2 users
Doc Type:	Bug Fix
Last Closed:	2004-12-20 20:54:50 UTC

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2004:550	0	normal	SHIPPED_LIVE	Updated kernel packages available for Red Hat Enterprise Linux 3 Update 4	2004-12-20 05:00:00 UTC

Description Eric Hagberg 2004-01-06 15:42:59 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030922

Description of problem:
With kernel 2.4.21-6.ELhugemem (and earlier hugemem kernels, for that
matter), simply running "cat /dev/kmem > /dev/null" panics the machine.

This is not the case for enterprise kernels (in AS 2.1) or smp kernels
in RHEL3.

Other applications that appear to trigger this bug are OpenAFS's kdump
and some Veritas utilities.

Version-Release number of selected component (if applicable):
2.4.21-6.ELhugemem

How reproducible:
Always

Steps to Reproduce:
1. cat /dev/kmem > /dev/null

Actual Results:  Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
0215e9ea
*pde = 00003001
*pte = 00000000

The machine then hangs.

Expected Results:  cat: /dev/kmem: read error [Bad address]

(this is what the RHAS 2.1 enterprise kernels do, for example)
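
To make the expected behaviour concrete, here is a minimal userspace sketch in C of what "cat <file> > /dev/null" amounts to: read until EOF or until read() fails, then report errno. The helper name drain_file is hypothetical and is not taken from cat or from this report.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read `path` to the end and discard the data, like `cat path > /dev/null`.
 * Returns the number of bytes read on a clean EOF, or -1 with errno set
 * if open() or read() fails. drain_file is a hypothetical helper name.
 */
static long long drain_file(const char *path)
{
    char buf[4096];
    long long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;

    if (n < 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    close(fd);
    return total;
}
```

On a kernel that behaves as described under Expected Results, drain_file("/dev/kmem") would presumably return -1 with errno set to EFAULT, which strerror() renders as "Bad address". It should not be pointed at /dev/kmem on an affected hugemem kernel.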

Comment 1 Arjan van de Ven 2004-01-06 15:44:42 UTC

reading kmem randomly can have ALL sorts of bad side effects and is
severely discouraged!

Comment 2 Eric Hagberg 2004-01-06 15:54:32 UTC

Sure, but simply reading from /dev/kmem shouldn't be able to panic the
machine. No other OS I'm aware of has such a problem.

Comment 3 Arjan van de Ven 2004-01-06 15:56:38 UTC

on IA64 it will certainly do so (you are expected to get a bunch of
machine checks if you do this on several chipsets) and some x86
chipsets will too. The easiest fix sounds like just removing /dev/kmem;
nothing uses it anyway (unlike /dev/mem).

Comment 4 Eric Hagberg 2004-01-06 16:32:49 UTC

klogd (at least judging by inspection of its strings) and OpenAFS's
kdump are at least a couple of things that do use /dev/kmem.

Comment 5 Arjan van de Ven 2004-01-06 16:35:53 UTC

klogd only does so for pre-2.0 kernels (before current modutils). I
can't imagine what kdump expects to find in /dev/kmem really...
there's not a lot of useful stuff in /dev/kmem at all.

Comment 6 Eric Hagberg 2004-01-06 16:41:04 UTC

It looks like Veritas (vxvm) may also have some probing of kmem, as we
were experiencing similar panics with the hugemem kernel, but none
under the smp one.

Comment 7 Arjan van de Ven 2004-01-06 17:03:31 UTC

kernel modules don't use /dev/kmem!
the hugemem kernel is different in another respect though; it REQUIRES
that kernel modules follow the proper copy_from_user() API, while
normal kernels sort of kinda usually work even when the API isn't
followed.
Crashes will look similar, yes....

Comment 8 Eric Hagberg 2004-01-06 17:06:05 UTC

I know that kernel modules don't use /dev/kmem... machines would panic
when the veritas user-level processes needed to create and manage
volumes were run.

Comment 9 Arjan van de Ven 2004-01-06 17:13:30 UTC

that sounds more like a missing copy_from_user in veritas code than
using /dev/kmem. I can't imagine communicating with kernel space via
/dev/kmem; that's what ioctls and such are for, which use copy_from_user().
It's actually a reasonably common bug in vendor code to forget to use
this API; some other OSes don't need it, and it mostly happens to work
in non-hugemem kernels (unless you happen to be under vm pressure).
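
The pattern described above can be illustrated with a small userspace model in C. This is not kernel code and not Veritas code: the user address space is modelled as a private buffer, "user pointers" are offsets into it, and model_copy_from_user, struct user_space and model_handler are hypothetical names. The only thing borrowed from the real copy_from_user() is its return convention (the number of bytes that could not be copied, 0 on success). The model is all-or-nothing, which is a simplification.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of a user address space that the "kernel" side must not
 * touch directly, only through model_copy_from_user(). */
struct user_space {
    unsigned char mem[256];
};

/*
 * Returns the number of bytes that could NOT be copied (0 on success),
 * mirroring the return convention of the kernel's copy_from_user().
 * `uaddr` is an offset into the modelled user space, not a C pointer.
 */
static size_t model_copy_from_user(void *dst, const struct user_space *us,
                                   size_t uaddr, size_t len)
{
    if (uaddr > sizeof(us->mem) || len > sizeof(us->mem) - uaddr)
        return len;             /* range is not mapped: nothing copied */
    memcpy(dst, us->mem + uaddr, len);
    return 0;
}

/* Hypothetical ioctl-style handler that fetches one int argument. */
static int model_handler(const struct user_space *us, size_t uarg, int *out)
{
    int val;

    /* The check that is easy to forget: fail cleanly with -EFAULT
     * instead of dereferencing the user address directly. */
    if (model_copy_from_user(&val, us, uarg, sizeof(val)) != 0)
        return -EFAULT;

    *out = val;
    return 0;
}
```

A handler written this way returns -EFAULT for a bad argument address, whereas one that dereferences the user address directly only works while kernel and user mappings happen to coincide.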

Comment 10 Eric Hagberg 2004-01-06 21:09:16 UTC

saias83 /ms/user/h/hagberg 4# cat /dev/kmem > /dev/null
cat: /dev/kmem: Bad address
saias83 /ms/user/h/hagberg 5# uname -r
2.4.21-4.EL
saias83 /ms/user/h/hagberg 6# arch
ia64

Comment 11 Eric Hagberg 2004-01-09 19:15:19 UTC

/dev/mem also suffers a similar fate - "cat /dev/mem > /dev/null"
causes a hugemem-running machine to hang hard (no ping, no nothing)
w/o any panic or oops logged.

I don't see why you wouldn't want to put bounds checking on regions or
parts of /dev/kmem that shouldn't be accessed or that will cause
machine panic.
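
As a rough sketch of the kind of bounds checking being asked for here, the following C fragment validates a read position against a table of readable ranges and fails with -EFAULT otherwise. It is not the fix that later shipped (this report does not describe that fix); struct region, the table contents and checked_read_len are assumptions made purely for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* A readable address range [start, end); `end` is exclusive. */
struct region {
    uint64_t start;
    uint64_t end;
};

/*
 * Return how many bytes of a read starting at `pos` may be served,
 * clamped to the end of the region containing `pos`, or -EFAULT when
 * `pos` lies outside every readable region.
 */
static long long checked_read_len(const struct region *tab, size_t n,
                                  uint64_t pos, size_t count)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pos >= tab[i].start && pos < tab[i].end) {
            uint64_t avail = tab[i].end - pos;
            return (long long)(count < avail ? count : avail);
        }
    }
    return -EFAULT;
}
```

With a check of this shape in front of the copy, a read that wanders into a hole would fail with "Bad address" (the result shown in the Expected Results) rather than fault inside the kernel.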

w/o such a fence around the kernel problem, the offending program(s)
can't really be debugged properly.

Comment 12 Eric Hagberg 2004-01-09 19:36:33 UTC

And crash opens /dev/kmem, so it isn't quite true to say that nothing
uses it. In fact there's an interesting note in there:

        /*
         *  On 32-bit architectures w/memory above ~936MB,
         *  that memory can only be accessed via vmalloc'd
         *  addresses.  However, /dev/mem returns 0 bytes,
         *  and non-reserved memory pages can't be mmap'd, so
         *  the only alternative is to read it from /dev/kmem.
         */

Comment 13 Eric Hagberg 2004-03-02 19:01:32 UTC

Any update on this?

Comment 14 Ernie Petrides 2004-03-06 01:56:57 UTC

I'll investigate this for RHEL 3 U3.

Comment 15 Ernie Petrides 2004-09-04 00:44:54 UTC

A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.3.EL).

Comment 16 John Flanagan 2004-12-20 20:54:50 UTC

An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html
