Bug 118098 - LTC6800-kernel oops with huge_page_release
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Larry Woodman
QA Contact: Brian Brock
Keywords: FutureFeature
Depends On:
Blocks: 113479
Reported: 2004-03-11 17:09 EST by IBM Bug Proxy
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Enhancement
Last Closed: 2004-09-02 00:31:09 EDT
Description IBM Bug Proxy 2004-03-11 17:09:35 EST
The following has been reported by IBM LTC:
kernel oops with huge_page_release
Hardware Environment:
x86 4-cpu box hyperthreaded (don't think this makes a diff)

Software Environment:
Red Hat EL AS 3 .. both -9.0.1-ELsmp and -11 (U2 beta kernel from 
ftp://partners.redhat.com/19443147e6de885277c0208e6fec70fe/2.4.21-11.EL/)

Steps to Reproduce:
1. Run our db2 bucket. When we finish creating a db and are detaching
from the shared memory segment, a kernel oops occurs. SHM_HUGETLB has
been used in the shmat call.
2. The machine kernel oopses while releasing the huge pages.

Actual Results:
EIP at huge_page_release 0x1e (2.4.21-9.0.1.ELsmp)

in the call stack:
unmap_hugepage_range
zap_hugepage_range
do_munmap
free_msg
sys_shmdt
sys_ipc
sys_gettimeofday

eax 36380500
ebx c27b0d84
ecx 00000005
edx c27b0d84
esi 1b200000
edi 1b400000
ebp eb9ca500
esp dbd79ee8

I can add more to this once I get the info.  This is stuff I collected 
yesterday, and is from the -9 kernel.

Expected Results:
It works as expected, or an error message is returned.

I will see if I can come up with a standalone program to re-pro this.
The version of DB2 that this happens on is not available for public
download yet.
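
For reference, a standalone program exercising the same attach/detach path might look like the sketch below. It is not the DB2 code and has not been confirmed to trigger the oops. It assumes the mainline Linux convention of passing SHM_HUGETLB to shmget() (the report above mentions the flag in connection with shmat), and the 4 MB rounding size and the helper names round_up_to and hugetlb_shm_cycle are made up for illustration.

```c
/* Sketch only: create, attach, touch and detach a SHM_HUGETLB segment. */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000 /* fallback: value used by mainline Linux headers */
#endif

/* Assumption: 4 MB is a multiple of the huge page size on the test box.
 * Check "Hugepagesize" in /proc/meminfo for the real value. */
#define ASSUMED_HPAGE_SIZE (4UL * 1024 * 1024)

/* Round len up to the next multiple of align. */
size_t round_up_to(size_t len, size_t align)
{
    assert(align != 0);
    return (len + align - 1) / align * align;
}

/* One create/attach/touch/detach cycle.
 * Returns 0 on success, -1 if any step fails (for example when no huge
 * pages are configured on the system). */
int hugetlb_shm_cycle(size_t len)
{
    size_t size = round_up_to(len, ASSUMED_HPAGE_SIZE);

    int id = shmget(IPC_PRIVATE, size, SHM_HUGETLB | IPC_CREAT | 0600);
    if (id < 0) {
        fprintf(stderr, "shmget(SHM_HUGETLB): %s\n", strerror(errno));
        return -1;
    }

    char *addr = shmat(id, NULL, 0);
    if (addr == (char *)-1) {
        fprintf(stderr, "shmat: %s\n", strerror(errno));
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    /* Touch every byte so the huge pages are actually mapped. */
    memset(addr, 0xab, size);

    /* The detach is where the report says the oops occurs. */
    int rc = shmdt(addr);
    if (rc != 0)
        fprintf(stderr, "shmdt: %s\n", strerror(errno));

    /* Mark the segment for removal whether or not the detach worked. */
    shmctl(id, IPC_RMID, NULL);
    return rc == 0 ? 0 : -1;
}
```

Calling hugetlb_shm_cycle(ASSUMED_HPAGE_SIZE) from a small main(), possibly in a loop, would exercise the shmdt path. On a machine with no huge pages reserved, the shmget call is expected to fail and the function returns -1 without touching any memory.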

Please send instructions on data to gather in order to debug this.
Thanks.
Glen / Mark

U2 beta kernel problem.
Comment 1 Larry Woodman 2004-03-12 11:13:30 EST
Please attach the actual console screen when the OOPS occurs.

Thanks, Larry Woodman
Comment 3 Larry Woodman 2004-03-18 22:10:04 EST
Is this an x440 only problem or have you been able to reproduce it on
another x86 SMP system?

Larry Woodman
Comment 4 Larry Woodman 2004-03-18 22:48:51 EST
The problem is that one of the small pages in the compound hugepage
is somehow being placed on the active list rather than being treated
as a special subset of the hugepage.  When the System V shared memory
region that maps the hugepage is unmapped via shmdt, the low-level VM
system recognizes this as corruption and the system BUGs.  Still
looking for the offending code; I suspect it is somewhere in the I/O
layer.

Larry
Comment 6 Larry Woodman 2004-06-18 11:45:39 EDT
Please try out this kernel ASAP.

http://people.redhat.com/~lwoodman/.for_yvonne/


Larry
Comment 9 Ernie Petrides 2004-06-20 09:39:47 EDT
A fix for this problem has just been committed to the RHEL3 U3
patch pool this evening (in kernel version 2.4.21-15.14.EL).
Comment 10 John Flanagan 2004-09-02 00:31:09 EDT
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-433.html
