Bug 452759

Summary: kernel lockup when a kernel page fault occurs.
Product: Red Hat Enterprise Linux 4 Reporter: Frank Ch. Eigler <fche>
Component: kernel Assignee: Prarit Bhargava <prarit>
Status: CLOSED NOTABUG QA Contact: Martin Jenner <mjenner>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.7 CC: fche, luyu, lwoodman, mhiramat, prarit, tyamamot, vgoyal
Target Milestone: rc   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-06-25 17:19:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:
Bug Depends On:    
Bug Blocks: 451707    

Description Frank Ch. Eigler 2008-06-24 20:54:49 UTC
+++ This bug was initially created as a clone of Bug #435530 +++

Description of problem:
When a kernel module accesses 0xafffffffffffffff using the __get_user() macro,
the kernel locks up.

Version-Release number of selected component (if applicable):
kernel-2.6.9-74.EL

How reproducible:
always

Steps to Reproduce:
1. Build the attached module (memacc.ko).
2. Run /sbin/insmod memacc.ko on the xen kernel.
3. Wait 10 seconds (insmod never returns).
4. The kernel shows a soft lockup message.

Actual results:
kernel hangs

See the attachment for bug #435530 for the test module source code.

Comment 1 Luming Yu 2008-06-25 07:15:55 UTC
how about rhel 5 kernel and upstream kernel with this test case?

Comment 2 Masami Hiramatsu 2008-06-25 14:09:34 UTC
(In reply to comment #1)
> how about rhel 5 kernel and upstream kernel with this test case?

On 2.6.18-94.el5, nothing happened.

However, as I reported in bug #435530, the rhel5.2 xen kernel might have
the same problem.

Comment 3 Prarit Bhargava 2008-06-25 14:18:35 UTC
So ... this is a kernel-xen issue?

Comment 4 Masami Hiramatsu 2008-06-25 14:31:09 UTC
(In reply to comment #3)
> So ... this a kernel-xen issue?

No, this entry is for the rhel4.7 kernel issue. kernel-2.6.9-74.EL has this issue.

Comment 5 Masami Hiramatsu 2008-06-25 16:32:12 UTC
I guess there are no pages mapped near the address 0xafffffffffffxxxx on the rhel4
kernel, and the page fault handler never returns.
The address is in region 5, and according to the document below (p. 26),
http://www.ia64-linux.org/doc/IA64linuxkernel.PDF
there seem to be no pages above 0xa00003ffffffffff.


Comment 6 Prarit Bhargava 2008-06-25 16:51:26 UTC
(In reply to comment #5)
> I guess there are no pages mapped near the address 0xafffffffffffxxxx on the rhel4
> kernel, and the page fault handler never returns.
> The address is in region 5, and according to the document below (p. 26),
> http://www.ia64-linux.org/doc/IA64linuxkernel.PDF
> there seem to be no pages above 0xa00003ffffffffff.
> 

I wonder what happens if you access a nonexistent page on x86.  I'm pretty sure
you would take an MCE ... the question is what should we do on ia64?

At a minimum, the process shouldn't hang.

P.

Comment 7 Prarit Bhargava 2008-06-25 17:19:07 UTC
So I was wondering "How can I determine whether or not it is valid to read from
the address supplied by a user in a module?"

I read through the kernel and noted the following comment:

/*
 * The "__xxx" versions do not do address space checking, useful when
 * doing multiple accesses to the same area (the programmer has to do the
 * checks by hand with "access_ok()")
 */
#define __put_user(x, ptr)      __put_user_nocheck((__typeof__(*(ptr))) (x), (ptr), sizeof(*(ptr)))
#define __get_user(x, ptr)      __get_user_nocheck((x), (ptr), sizeof(*(ptr)))

Frank and Masami,

If I'm reading the above correctly, your code is incomplete.  The module as
currently written is basically doing what amounts to a NULL dereference (it is
interesting that the code hangs BTW).

I think the following code is better and correctly calls access_ok() before
attempting the __get_user (as is specified in the kernel):

static int initmod(void)
{
        int val=0;
        int * addr = (int*)0xafffffffffffffffLL; // kernel nonexist page
        /* kernel says user must do access_ok if __get_user is called */
        if (access_ok(VERIFY_WRITE, addr, KERNEL_DS)) {
                __get_user(val, addr);
        } else
                printk("access not ok\n");
        return 0;
}

The above module code will always fail on the access_ok check.

Closing as NOTABUG.

P.

Comment 8 Frank Ch. Eigler 2008-06-25 17:23:31 UTC
Prarit, why access_ok(... KERNEL_DS)?  This is user-space data we're pretending
to access.

Comment 9 Prarit Bhargava 2008-06-25 17:33:07 UTC
Yes, but you're still kernel-side.

__get_user() calls __get_user_nocheck() which calls __do_get_user(..., KERNEL_DS).

i.e., KERNEL_DS is always the segment used when __get_user() is called.

P.



Comment 10 Frank Ch. Eigler 2008-06-25 17:43:17 UTC
I'm trying to test it for myself, but maybe you have an ia64 machine
you can try it on yourself:

-   if (access_ok(VERIFY_WRITE, addr, KERNEL_DS)) {
+   if (access_ok(VERIFY_READ, addr, 4)) {

Does that work for you?


Comment 11 Prarit Bhargava 2008-06-25 18:00:45 UTC
I'm sure that will work but it doesn't matter what you set the segment value to
-- it will always get set to KERNEL_DS.

P.

Comment 12 Frank Ch. Eigler 2008-06-25 18:16:14 UTC
Thanks, confirmed: access_ok(...., {1, 0, 4}) all work as advertised.
So systemtap must be missing an access_ok() check where it is needed.


Comment 13 Prarit Bhargava 2008-06-25 18:24:07 UTC
Yes, that's what I would think.  The code is pretty explicit about stating the
requirement of access_ok() when using __get_user().

P.

Comment 14 Masami Hiramatsu 2008-06-25 19:08:08 UTC
Thank you, Prarit.

Frank, I tested that: access_ok() works when the address is required to be a
user address. But I think this can't apply to systemtap, because systemtap uses
code similar to __get_user() for accessing kernel addresses (kread).
Anyway, that is systemtap's bug, not a kernel bug.



Comment 15 Frank Ch. Eigler 2008-06-25 20:07:10 UTC
There is a kernel issue still in that systemtap would like to have some
mechanism to dereference arbitrary kernel addresses, with exception-style page
fault catching.  Something like the probe_kernel_* routines in recent kernels
could do the trick.  Prarit, do you happen to know of something already in
RHEL4.7 to satisfy that need?

Comment 16 Prarit Bhargava 2008-06-26 12:14:10 UTC
Nothing that I know of -- you might want to ping vgoyal as he might have a
better idea.

P.

Comment 17 Vivek Goyal 2008-06-26 21:55:03 UTC
Prarit,

I don't understand the IA64 code but here are my general thoughts/queries.

- access_ok() just verifies that you are accessing a user space address (at
least on x86). So if I am trying to access a kernel address and pass it to
access_ok(), then it should say that you should not access this address. I think
that's what might be averting the problem here: we are trying to access a
non-existent kernel address, but access_ok() says no.

- But that does not take away the problem that if a module is trying to access a
nonexistent kernel address using __get_user(), then either the kernel should
crash or __get_user() should return -EFAULT. In this case the page fault handler
hangs, so it does sound like a bug. Hanging is not the solution. Either crash, or
let the fixup code handle it.

- Frank mentioned that on x86_64, __get_user_xx() also allows poking at kernel
addresses and returns -EFAULT. Maybe we can try to emulate the same
behavior on ia64. I am not aware whether any of the functions allow that on ia64.

So I think this sounds like a bug and should not be closed as NOTABUG. The page
fault handler is misbehaving for sure.