This bug has been copied from bug #472802 and has been proposed to be backported to 5.2 z-stream (EUS).
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
in kernel-2.6.18-92.1.22.el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-1017.html
*** Bug 473941 has been marked as a duplicate of this bug. ***
added bugproxy.com to the cc list for mirroring to IBM

Ramon, can you reproduce this problem on a Power box with rhel5.3 snap4 by following the steps below (posted by Larry) and post your findings asap?

To reproduce the problem I just ran these commands over and over until the system panic()'d:

# cat /proc/meminfo                        (look for no hugepages allocated)
# echo 100 > /proc/sys/vm/nr_hugepages     (allocate 100 hugepages)
# cat /proc/meminfo                        (look for 100 hugepages allocated)
# echo 0 > /proc/sys/vm/nr_hugepages       (free the 100 hugepages)

The system panic()'d within a few iterations without the patch, but it stays up forever with the patch applied. The act of allocating hugepages overflows the kernel stack and corrupts the memory below it, so the system will crash as soon as the overflow results in corruption that damages anything important.

Hi, I was not able to reproduce this issue here. System information:

[root@keechi-lp1 ~]# uname -a
Linux keechi-lp1.ltc.austin.ibm.com 2.6.18-124.el5 #1 SMP Mon Nov 17 16:58:59 EST 2008 ppc64 ppc64 ppc64 GNU/Linux

[root@keechi-lp1 ~]# cat /proc/meminfo
MemTotal:       33452928 kB
MemFree:        31227136 kB
Buffers:           97920 kB
Cached:           191488 kB
SwapCached:            0 kB
Active:           268928 kB
Inactive:         145664 kB
HighTotal:             0 kB
HighFree:              0 kB
LowTotal:       33452928 kB
LowFree:        31227136 kB
SwapTotal:       1048448 kB
SwapFree:        1048448 kB
Dirty:               512 kB
Writeback:             0 kB
AnonPages:        124736 kB
Mapped:            47616 kB
Slab:             125440 kB
PageTables:         9024 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
CommitLimit:    16955712 kB
Committed_AS:     449856 kB
VmallocTotal: 8589934592 kB
VmallocUsed:       11776 kB
VmallocChunk: 8589921856 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
Hugepagesize:      16384 kB

With 10,000 iterations of Larry's steps I could not trigger this issue:

Iteration: 10000
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0

Best regards,

Mark, Ramon could not reproduce this bug with rhel5.3 snap4 on a p 575 (with 32GB memory).
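For soak testing, the toggle steps above can be wrapped in a loop. This is a minimal sketch assuming a POSIX shell; the fixed iteration count, the DRY_RUN guard, and the run() helper are illustrative additions and not part of the original report. With DRY_RUN=1 (the default) it only prints what it would do; run it as root with DRY_RUN=0 to actually toggle the hugepage pool.

```shell
#!/bin/sh
# Sketch of the hugepage toggle reproducer. DRY_RUN=1 (default) only
# prints the commands; DRY_RUN=0 (as root) really writes to /proc.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print or execute a command depending on DRY_RUN.
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

i=1
while [ "$i" -le 5 ]; do       # the real test loops until panic or timeout
    echo "Iteration: $i"
    run sh -c 'echo 100 > /proc/sys/vm/nr_hugepages'  # allocate 100 hugepages
    run grep HugePages_Total /proc/meminfo            # look for 100 allocated
    run sh -c 'echo 0 > /proc/sys/vm/nr_hugepages'    # free the 100 hugepages
    run grep HugePages_Total /proc/meminfo            # look for 0 allocated
    i=$((i + 1))
done
```

Per the report, the unpatched kernel panics within a few iterations of the real loop on an affected machine, while the patched kernel stays up indefinitely.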
Can you check with Red Hat to see whether any specific system setup/configuration was used when they reproduced the bug.

This has been a difficult bug for me to reproduce as well; I've done it successfully on an ia64 system with a very large amount of memory. Try putting the system under a load which uses most of the memory (ltp-stress, sys_basher's memory test, or something similar), then try the reproducer again.

IBM, it turns out this repro case only occurs on IA64; however, it would be good to run POWER and x86/64 through the usual largepage testing to assure the patch does not affect any other functionality.

I found a little hugepage test program on lkml. I think the program itself is buggy, but running it while toggling hugepages as described before will reproduce the bug very quickly and easily with the -124 kernel on ia64. I couldn't reproduce it in about 30 minutes of running with the -125 kernel. http://lkml.indiana.edu/hypermail/linux/kernel/0312.3/0258.html

*** This bug has been marked as a duplicate of 474347 ***
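The load-then-reproduce suggestion above can be sketched as a shell script. ltp-stress and sys_basher are the external tools the comment names; as a rough stand-in this sketch uses dd into tmpfs as a memory hog, which is my assumption and not from the report (as is the 80% target figure). DRY_RUN=1 (the default) only prints the commands.

```shell
#!/bin/sh
# Sketch: consume most of RAM, then re-run the hugepage toggle reproducer.
# The dd-into-tmpfs hog is an illustrative stand-in for ltp-stress or
# sys_basher's memory test. DRY_RUN=1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $*"; else "$@"; fi
}

# Target roughly 80% of MemTotal (illustrative figure).
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
hog_kb=$((mem_kb * 80 / 100))

run dd if=/dev/zero of=/dev/shm/hog bs=1024 count="$hog_kb"  # memory hog
run sh -c 'echo 100 > /proc/sys/vm/nr_hugepages'             # allocate
run sh -c 'echo 0 > /proc/sys/vm/nr_hugepages'               # free
run rm -f /dev/shm/hog                                       # clean up
```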