The following has been reported by IBM LTC:
kernel oops with huge_page_release
x86 4-cpu box hyperthreaded (don't think this makes a diff)
Red Hat EL AS 3 .. both -9.0.1-ELsmp and -11 (U2 beta kernel from
Steps to Reproduce:
1. run our db2 bucket ... when we finish creating a db, and are
the shared memory segment, a kernel oops occurs. SHM_HUGETLB has been
the shmat call.
2. The machine's kernel oopses while releasing the huge pages.
EIP at huge_page_release 0x1e (2.4.21-0.9.1.ELsmp)
in the call stack:
I can add more to this once I get the info. This is stuff I collected
yesterday, and is from the -9 kernel.
It works as expected, or returns an error message.
I will see if I can come up with a standalone program to re-pro this.
version of DB2 that this happens on is not available for public
Please send instructions on data to gather in order to debug this.
Thanks. Glen / Mark
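
For anyone attempting the standalone reproducer mentioned above, a minimal sketch of the create/attach/detach cycle follows. It is not the DB2 code path and has not been confirmed to trigger the oops. The helper name hugetlb_shm_cycle and the segment size are hypothetical, and the sketch assumes the usual Linux convention of passing SHM_HUGETLB in the shmget flags. It does no file I/O, so if the corruption really originates in the IO layer it may well not reproduce the problem. The size should be a multiple of the huge page size, and huge pages must be reserved beforehand.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000   /* Linux value, for libc headers that lack it */
#endif

/*
 * Hypothetical helper: create, attach, touch, detach and remove one
 * SHM_HUGETLB segment. Returns 0 if the whole cycle succeeds and -1 if
 * any step fails (for example when no huge pages are reserved).
 */
int hugetlb_shm_cycle(size_t size)
{
    assert(size > 0);

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | SHM_HUGETLB | 0600);
    if (id < 0) {
        fprintf(stderr, "shmget: %s\n", strerror(errno));
        return -1;
    }

    char *p = shmat(id, NULL, 0);
    if (p == (void *)-1) {
        fprintf(stderr, "shmat: %s\n", strerror(errno));
        shmctl(id, IPC_RMID, NULL);   /* do not leak the segment */
        return -1;
    }

    memset(p, 0x5a, size);            /* fault the huge pages in */

    /* The report places the oops at the point the huge pages are released. */
    int rc = shmdt(p);
    if (rc < 0)
        fprintf(stderr, "shmdt: %s\n", strerror(errno));

    shmctl(id, IPC_RMID, NULL);       /* mark the segment for removal */
    return rc == 0 ? 0 : -1;
}
```

Calling hugetlb_shm_cycle(4UL * 1024 * 1024) in a loop from a small main(), ideally while the box is under I/O load, would be one way to exercise the path. A return of -1 with an shmget error usually just means no huge pages are available.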
U2 beta kernel problem.
Please attach the actual console screen when the OOPS occurs.
Thanks, Larry Woodman
Is this an x440 only problem or have you been able to reproduce it on
another x86 SMP system?
The problem is that one of the small pages in the compound bigpage
is somehow getting placed on the active list rather than being treated
as a special subset of the hugepage. When the System V shared memory
region that maps the hugepage is unmapped via shmdt, the low-level VM
system recognizes this as corruption and the system BUGs. Still
looking for the offending code; I suspect it is somewhere in the IO layer.
Please try out this kernel ASAP.
A fix for this problem has just been committed to the RHEL3 U3
patch pool this evening (in kernel version 2.4.21-15.14.EL).
An errata has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.