Bug 107806

Summary: Boot install Oops in "honours the WP bit" check
Product: [Retired] Red Hat Linux
Reporter: Keith Wright <kwright>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 9
CC: riel
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-09-30 15:41:37 UTC Type: ---

Description Keith Wright 2003-10-23 07:51:32 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5a) Gecko/20030728
Mozilla Firebird/0.6.1

Description of problem:
On certain old or cheap hardware the Linux kernel of Red Hat 9
fails to boot, filling the screen with an Oops dump
early in the process.  The telltale sign is
that it prints

Checking if this processor honours the WP bit even in supervisor mode...

immediately before losing its cookies.

This problem occurs in Red Hat 9 kernels released before
summer 2003, and is caused by a patch called
oopsmeharder.



Version-Release number of selected component (if applicable):
kernel-2.4.20-6

How reproducible:
Always

Steps to Reproduce:
1. Insert boot install floppy or CD
2. Press reset or power on
3. Watch it Dump Hex and Die
    

Actual Results:  Kernel Oops 0003

Expected Results:  Install

Additional info:

This bug is marked as closed, but it's not over till
the paperwork is done.  This is the same as bugs 85208
and 85908, which were "fixed" a while ago.
Unfortunately, the only way to know that is to debug it
yourself and search for the name of the patch that has
the problem.  In particular, the Via Mini-ITX system
has nothing much to do with the case.  I hope my title
and description will be easier to find than the rather
irrelevant titles on the previous bug reports.

It would be nice if this were mentioned among the bugs fixed
by the kernel updates. It is rather more important than
"obscure problems in foo driver".
There are still CDs on the market that have this problem
and it does not make a good first impression. 

Despite asking in comp.os.linux.redhat, much Google
search, and hours of browsing www.redhat.com, the
only way I learned about this was by reading the
kernel source, tracing the problem, testing my
own patch, and searching Bugzilla for "oopsmeharder".

------
From: kwright (Programmer in Chief)
Newsgroups: comp.os.linux.redhat
Subject: Unable to page and Kernel oops on RH 9 install bootdisk.img

I have a few-years-old Cyrix-based machine that has had previous
versions of Red Hat (7.1, 5.2) installed, as well as SuSE and Debian.

When I try to install RH 9, the boot floppy gets half-booted and
fails with the message:
  Checking whether this processor honours the WP bit
       even in supervisor mode ...
  <1> Unable to handle kernel paging request at virtual
        address c0000000
  printing eip: c01143eb
  *pde = 00001063   *pte = 00000025
  Oops: 0003
       <etc, see below>
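
For readers decoding the dump above: on i386 the Oops number is the
page-fault error code pushed by the CPU, and the low bits of *pte are
the page's flag bits.  The following is a minimal sketch in C (the macro
and function names are made up for illustration, they are not taken from
the kernel source).  It shows that 0003 means a write from supervisor
mode to a page that is present but protected, and that pte 00000025
describes a present, read-only page.  That is what one would expect
from a test that deliberately writes to a write-protected page.

```c
#include <stdbool.h>
#include <stdint.h>

/* i386 page-fault error code bits (illustrative names). */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access         */
#define PF_USER  0x4u  /* 0: supervisor mode,  1: user mode            */

/* Low flag bits of an i386 page-table entry (illustrative names). */
#define PTE_PRESENT  0x01u
#define PTE_RW       0x02u  /* set: writable, clear: read-only */
#define PTE_USER     0x04u
#define PTE_ACCESSED 0x20u

/* True when the fault was a write, in supervisor mode, to a present page. */
bool is_supervisor_write_protection_fault(uint32_t error_code)
{
    uint32_t bits = error_code & (PF_PROT | PF_WRITE | PF_USER);
    return bits == (PF_PROT | PF_WRITE);
}

/* True when the entry maps a page that is present but not writable. */
bool pte_is_present_read_only(uint32_t pte)
{
    return (pte & PTE_PRESENT) != 0 && (pte & PTE_RW) == 0;
}
```

Calling is_supervisor_write_protection_fault(0x0003) and
pte_is_present_read_only(0x00000025) with the values from the dump
both give true.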

This also happens when I boot directly from CDROM.

A few months ago the following message was posted,
which gives exactly the same error message I see
from the Oops onward.  Does anybody know any more
about this?  Is the kernel patch from Red Hat going
to fix this?

       -- Keith

> From: Average GG (redacted)
> Subject: red hat linux 9 panic during install on walmart $200 machine
> Date: 2003-05-26 19:21:04 PST
> 
> i've been running red hat linux 8 on my walmart $200 machine
> with no trouble for a while now.  i decided to install red hat 9
> on it and had a repeatable kernel panic during the install cd (#1) boot.
> i have not seen this posted before so i thought i'd send out
> a mention.
> if anyone else is seen this or knows what to do please post.
> 
> tks
> agg
> 
> here's the kernel panic info i wrote down:
> 
> oops: 0003
> cpu 0
> eip 0060:[<c01143eb>] not tainted
>     <clip>
 
I put a patch into mm/fault.c to do some printk's as it handles the fault.
The upshot seems to be that Red Hat's oopsmeharder patch has broken the
WP bit test.

The RH oopsmeharder patch does two things:
  (1) adds an additional check that the page fault does
      not occur while a page fault is in progress.
  (2) removes the search of the exception table, which
      would otherwise let the kernel recover from the
      fault if the faulting eip is in the table.

It seems that the first change is OK (at least in this case)
but the second one torques up the WP bit test.
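
To make point (2) concrete, here is a simplified user-space model in C
of the decision a kernel fault handler makes.  It is a sketch only, not
the code from mm/fault.c: the structure, the function names, and the
fixup address c01143f0 are all hypothetical, and it assumes (as the
tracing above suggests) that the WP bit test depends on an
exception-table entry for the instruction that does the write.  Only
the eip c01143eb comes from the Oops.

```c
#include <stddef.h>
#include <stdint.h>

/* One exception-table entry: an instruction that is allowed to fault,
 * and the address where execution resumes if it does. */
struct ex_entry {
    uint32_t insn;
    uint32_t fixup;
};

/* Hypothetical table with one entry for the WP test's write instruction.
 * The insn address is the eip from the Oops; the fixup is invented. */
static const struct ex_entry ex_table[] = {
    { 0xc01143ebu, 0xc01143f0u },
};

/* Return the fixup address for eip, or 0 if eip is not in the table. */
uint32_t search_table(const struct ex_entry *tab, size_t n, uint32_t eip)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].insn == eip)
            return tab[i].fixup;
    }
    return 0;
}

/* Model of the kernel-mode fault path.  Returns the address to resume
 * at, or 0 meaning "no recovery possible, Oops".  With search_enabled
 * set to 0 the table is never consulted, which models change (2). */
uint32_t resume_address(uint32_t eip, int search_enabled)
{
    if (search_enabled) {
        size_t n = sizeof(ex_table) / sizeof(ex_table[0]);
        return search_table(ex_table, n, eip);
    }
    return 0;
}
```

In this model resume_address(0xc01143eb, 1) returns the fixup address,
so the test can continue, while resume_address(0xc01143eb, 0) returns 0,
which corresponds to the Oops seen at boot.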

Does Red Hat know this?  Have they fixed it?

       -- Keith

PS: I have a big screen here, it's kind of a PITA to be forced to
type into a mailing-label-sized window.

Comment 1 Bugzilla owner 2004-09-30 15:41:37 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/