Bug 474347 - [REG][5.3] Kernel panics when you prepare hugepages.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jiri Pirko
QA Contact: Martin Jenner
Keywords: Regression, ZStream
Duplicates: 473941
Depends On: 472802
Blocks:
 
Reported: 2008-12-03 07:56 EST by RHEL Product and Program Management
Modified: 2015-05-04 21:15 EDT
CC List: 15 users

Doc Type: Bug Fix
Last Closed: 2008-12-16 02:34:26 EST
Attachments: None

Description RHEL Product and Program Management 2008-12-03 07:56:45 EST
This bug has been copied from bug #472802 and has been proposed
to be backported to 5.2 z-stream (EUS).
Comment 2 RHEL Product and Program Management 2008-12-03 08:21:23 EST
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.
Comment 5 Jiri Pirko 2008-12-05 09:18:33 EST
in kernel-2.6.18-92.1.22.el5
Comment 9 errata-xmlrpc 2008-12-16 02:34:26 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-1017.html
Comment 10 Linda Wang 2009-01-08 11:41:11 EST
*** Bug 473941 has been marked as a duplicate of this bug. ***
Comment 11 IBM Bug Proxy 2009-01-08 11:42:54 EST
added bugproxy@us.ibm.com to the cc list for mirroring to IBM


Ramon,

Can you reproduce this problem on a Power box with rhel5.3 snap4 by following the steps below (posted by Larry) and post your findings asap?

To reproduce the problem I just ran these commands over and over until the
system panic()'d:

# cat /proc/meminfo                         look for no hugepages allocated
# echo 100 > /proc/sys/vm/nr_hugepages      allocate 100 hugepages
# cat /proc/meminfo                         look for 100 hugepages allocated
# echo 0 > /proc/sys/vm/nr_hugepages        free the 100 hugepages

The system panic()'d within a few iterations without the patch but it stays up
forever with the patch applied.  The act of allocating hugepages overflows the
kernel stack and corrupts the memory below it, so the system will crash as soon
as the overflow results in corruption that damages anything important.
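
The manual steps above can be wrapped in a loop. This is only a minimal sketch, not the script anyone in this thread actually ran: the function names, the NR_HUGEPAGES and MEMINFO variables, and the iteration count in the example are assumptions made for illustration. It needs root and may panic an affected kernel.

```shell
#!/bin/sh
# Sketch of the reproduction loop described above.
# Assumptions (not from the original report): the variable names,
# the function names, and the iteration count are illustrative.
NR_HUGEPAGES="${NR_HUGEPAGES:-/proc/sys/vm/nr_hugepages}"
MEMINFO="${MEMINFO:-/proc/meminfo}"

show_hugepages() {
    # Print only the HugePages_* counters from meminfo.
    grep '^HugePages_' "$MEMINFO"
}

toggle_once() {
    # Allocate $1 huge pages, report, then free them and report again.
    echo "$1" > "$NR_HUGEPAGES"
    show_hugepages
    echo 0 > "$NR_HUGEPAGES"
    show_hugepages
}

run_loop() {
    # $1 = number of iterations, $2 = huge pages to allocate each time
    i=1
    while [ "$i" -le "$1" ]; do
        echo "Iteration: $i"
        toggle_once "$2"
        i=$((i + 1))
    done
}

# Example (as root; may crash an unpatched kernel):
#   run_loop 10000 100
```

Both paths are read when the functions are called, so they can be pointed at scratch files to dry-run the loop on a machine that should not be crashed.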


Hi,

I was not able to reproduce this issue here.

System information:

[root@keechi-lp1 ~]# uname -a
Linux keechi-lp1.ltc.austin.ibm.com 2.6.18-124.el5 #1 SMP Mon Nov 17 16:58:59 EST 2008 ppc64 ppc64 ppc64 GNU/Linux
[root@keechi-lp1 ~]# cat /proc/meminfo
MemTotal:     33452928 kB
MemFree:      31227136 kB
Buffers:         97920 kB
Cached:         191488 kB
SwapCached:          0 kB
Active:         268928 kB
Inactive:       145664 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     33452928 kB
LowFree:      31227136 kB
SwapTotal:     1048448 kB
SwapFree:      1048448 kB
Dirty:             512 kB
Writeback:           0 kB
AnonPages:      124736 kB
Mapped:          47616 kB
Slab:           125440 kB
PageTables:       9024 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  16955712 kB
Committed_AS:   449856 kB
VmallocTotal: 8589934592 kB
VmallocUsed:     11776 kB
VmallocChunk: 8589921856 kB
HugePages_Total:   100
HugePages_Free:    100
HugePages_Rsvd:      0
Hugepagesize:    16384 kB

With 10,000 iterations of Larry's steps I could not trigger this issue:

Iteration: 10000
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Total: 100
HugePages_Free: 100
HugePages_Rsvd: 0

Best regards,


Mark,

Ramon could not reproduce this bug with rhel5.3 snap4 on a p 575 (with 32GB memory).
Can you check with Red Hat to see whether any specific system setup/configuration was used when they reproduced the bug?

This has been a difficult bug for me to reproduce as well; I've done it successfully on an ia64 system with a very large amount of memory. Try putting the system under a load which uses most of the memory (ltp-stress, sys_basher's memory test or something similar), then try the reproducer again.

IBM, it turns out this repro case only occurs on IA64. However, it would be good to run POWER and x86/64 through the usual largepage testing to ensure the patch does not affect any other functionality.

I found a little hugepage test program on lkml. I think the program itself is
buggy, but it, along with toggling hugepages as described before, will reproduce
the bug very quickly and easily with the -124 kernel on ia64. I couldn't reproduce it in about 30 minutes of running with the -125 kernel.

http://lkml.indiana.edu/hypermail/linux/kernel/0312.3/0258.html
*** This bug has been marked as a duplicate of 474347 ***