Bug 130843 (IT_47550) - not-present translations for region 5(vmalloc'd area) not handled
Summary: not-present translations for region 5(vmalloc'd area) not handled
Alias: IT_47550
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel   
Version: 3.0
Hardware: ia64
OS: Linux
Target Milestone: ---
Assignee: Dave Anderson
QA Contact: Brian Brock
URL: http://linux.bkbits.net:8080/linux-2....
Depends On:
Blocks: 123574
Reported: 2004-08-25 02:39 UTC by Suresh Siddha
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2004-12-20 20:55:59 UTC

Attachments
Patch handling not-present faults for region 5 (1.35 KB, patch)
2004-08-25 02:41 UTC, Suresh Siddha

External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2004:550
Priority: normal
Status: SHIPPED_LIVE
Summary: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 4
Last Updated: 2004-12-20 05:00:00 UTC

Description Suresh Siddha 2004-08-25 02:39:02 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

Description of problem:
This bug is fixed in the latest 2.4 and 2.6 BitKeeper trees.

We can end up in the page fault handler even when a region 5 address 
is valid, because the VHPT walker may have inserted a not-present 
translation that has since become stale. Because the page fault 
handler in EL3 doesn't handle not-present translations for region 5, 
it oopses.
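
The check described above can be sketched in user-space C. This is a toy illustration, not the attached patch and not kernel code: the function and enum names are invented for the example, and the `pte_present` argument stands in for what a walk of the kernel page tables would report. The one architectural fact it relies on is that ia64 encodes the region number in bits 63:61 of a virtual address.

```c
#include <stdbool.h>
#include <stdint.h>

/* ia64 virtual addresses carry the region number in bits 63:61. */
static inline unsigned region_number(uint64_t addr)
{
    return (unsigned)(addr >> 61);
}

/* Hypothetical outcome names, used only in this sketch. */
enum fault_result {
    FAULT_RETRY,  /* translation is valid after all: return and retry */
    FAULT_OOPS,   /* genuinely unmapped kernel address */
    FAULT_OTHER   /* not region 5: handled by the usual paths */
};

/*
 * Classify a not-present fault taken in kernel mode.
 * pte_present is an assumption of this sketch: it represents the
 * result of looking the address up in the kernel page tables.
 */
enum fault_result classify_kernel_fault(uint64_t addr, bool pte_present)
{
    if (region_number(addr) != 5)
        return FAULT_OTHER;
    if (pte_present)
        return FAULT_RETRY;  /* the TLB entry was stale, not the mapping */
    return FAULT_OOPS;
}
```

The unpatched behaviour corresponds to skipping the `pte_present` test, so every region 5 not-present fault falls through to the oops case.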

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. The kernel will fail to boot if there is a lot of interrupt 
activity handled by modules (vmalloc'd text).

Additional info:

Comment 1 Suresh Siddha 2004-08-25 02:41:08 UTC
Created attachment 103049
Patch handling not-present faults for region 5

Patch is straight from bkbits.


Comment 2 Larry Woodman 2004-08-27 11:34:17 UTC
Suresh, pardon my ignorance here, but how does this happen?  If the
kernel only performs atomic updates to the ptes (never clearing one bit
at a time and leaving the pte in an inconsistent state), how does the
VHPT walker insert a TLB entry that's half-baked?  If there are cases
where the pte is in some inconsistent/interim state, should we fix that?

Thanks, Larry 

Comment 3 Suresh Siddha 2004-08-27 17:21:04 UTC
Here is the failing sequence:

t0: On cpu1, while the kernel is servicing requests from driver 
module A, the hardware VHPT walker inserts the empty ptes (page not 
present entries) around the module code address 'A' into the TLB.

t1: On cpu0, as part of loading new module 'B', vmalloc_area_pages() 
sets up the ptes for module 'B' in swapper_pg_dir without doing 
flush_tlb_all(). (This is OK because we do flush_tlb_all() in 
vmfree_area_pages().) But the module 'B' address happens to be the 
same as that of the empty ptes (page not present entries) that were 
loaded into cpu1's TLB in step 't0' above.

t2: When the module 'B' code starts executing on cpu1, it takes a 
page_not_present fault because of the page not present entries in 
cpu1's TLB. And because the page_fault handler doesn't handle faults 
in region '5', it simply oopses.

Because the page_not_present handler purges the corresponding 
not-present TLB entry, the next page-table walk will succeed.
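
The t0/t1/t2 sequence above can be replayed with a small user-space model. This is only a sketch under stated assumptions: the page table is a flat array, each CPU's TLB is an array of cached states, and every name here (`vhpt_walk`, `map_page_no_flush`, `access_page`, `reset_model`) is hypothetical and chosen for the example. None of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_CPUS  2
#define NUM_PAGES 8

enum tlb_state { TLB_EMPTY, TLB_NOT_PRESENT, TLB_PRESENT };

static bool page_table[NUM_PAGES];               /* shared: true = pte present */
static enum tlb_state tlb[NUM_CPUS][NUM_PAGES];  /* per-CPU cached translations */

void reset_model(void)
{
    memset(page_table, 0, sizeof page_table);
    memset(tlb, 0, sizeof tlb);
}

/* Toy hardware walker: caches whatever the page table says right now,
 * including "not present" (step t0). */
void vhpt_walk(int cpu, int vpn)
{
    assert(cpu >= 0 && cpu < NUM_CPUS && vpn >= 0 && vpn < NUM_PAGES);
    tlb[cpu][vpn] = page_table[vpn] ? TLB_PRESENT : TLB_NOT_PRESENT;
}

/* Step t1: set up the pte without flushing any CPU's TLB. */
void map_page_no_flush(int vpn)
{
    assert(vpn >= 0 && vpn < NUM_PAGES);
    page_table[vpn] = true;
}

/*
 * Step t2: touch a page from a given CPU. Returns true if the access
 * completes, false where the modelled kernel would oops.
 * recheck_page_table selects the fixed behaviour: after the fault
 * handler purges the stale entry, consult the page table again.
 */
bool access_page(int cpu, int vpn, bool recheck_page_table)
{
    assert(cpu >= 0 && cpu < NUM_CPUS && vpn >= 0 && vpn < NUM_PAGES);
    if (tlb[cpu][vpn] == TLB_EMPTY)
        vhpt_walk(cpu, vpn);
    if (tlb[cpu][vpn] == TLB_PRESENT)
        return true;

    tlb[cpu][vpn] = TLB_EMPTY;      /* not-present handler purges the entry */
    if (recheck_page_table && page_table[vpn]) {
        vhpt_walk(cpu, vpn);        /* the rewalk now finds the valid pte */
        return true;
    }
    return false;                   /* treated as a bad kernel address */
}
```

Calling `vhpt_walk(1, 3)`, then `map_page_no_flush(3)`, then `access_page(1, 3, false)` reproduces the oops in this model. The same sequence with `recheck_page_table` set to `true` succeeds, while an access to a page that was never mapped still fails.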

Comment 4 Larry Woodman 2004-08-27 19:33:38 UTC


Comment 5 Dave Anderson 2004-09-13 18:09:52 UTC
Either my patch:


or Norm Murray's patch:


will address this issue.  Norm's was generated from an LLNL IT, but
is identical except for the addition of a KERN_CRIT to the beginning
of a printk() in do_page_fault().

Comment 8 Suresh Siddha 2004-09-14 01:25:08 UTC
I can't access the above-mentioned post-office URL. Please let me 
know if you need any more info or if you think the patch posted in 
comment #1 isn't enough.

Comment 9 Ernie Petrides 2004-09-14 01:51:12 UTC
Hi, Suresh.  The URLs in comment #5 are restricted to Red Hat.
A minor variation of your patch (due to a RHEL3 porting issue)
is on track for U4.  I'll update this bug report when the patch
is committed (in the next day or two).

Thanks for isolating the problem and providing the patch.

Comment 10 Ernie Petrides 2004-09-15 00:10:22 UTC
A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.6.EL).

Comment 11 John Flanagan 2004-12-20 20:55:59 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

