Description of problem:
I used bonnie++ 1.03 to run a memory test with 20GB of system memory on
RH3.0 U7 32 bit (with the Hugemem rpm installed), RH4.0 U3 64 bit, and SLES10 32 bit.
All of them work well; the test fails only on RH4.0 U3 32 bit.
So I think it may be an RH4.0 U3 32 bit issue.
Besides, on RH4.0 U3 32 bit, the system kills other processes to free memory
when it shows "Out of memory". That's not correct.
BTW, it's weird that RH3.0 U7 32 bit shows only 4GB of system memory
if I don't install the Hugemem rpm, but RH4.0 U3 32 bit always shows 20GB of system
memory regardless of whether Hugemem is installed or not.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Pere, please attach the show_mem() output that was written to /var/log/messages
when the OOM kill occurs.
Created attachment 137282 [details]
log file when OOM happens
Created attachment 137283 [details]
Display of killing process
Hi Larry, I have attached the information you wanted. Please help us find the
root cause and a solution soon. Thanks.
The problem is that you are running the smp kernel on a 32GB system and that's
not supported. Please install the hugemem kernel, reboot your system and that
will fix the problem. Let me know the results as soon as you can do this.
We did install the hugemem kernel, but the OOM still occurred. You can refer to the
description at the top of this bug ID. This problem happens on systems with >
16GB memory, and only happens on the RH4 32bit OS.
Our system supports AMD Opteron rev.F 1207 CPU + DDR-II 667 registered
Can you attach the logfile from the OOM kill when it was running the hugemem kernel?
Thanks, Larry Woodman
Created attachment 137635 [details]
logfile of OOM after installing hugemem kernel
The problem is that the system consumed most of lowmem with bounce buffers:
>>Normal free:640kB min:928kB low:1856kB high:2784kB active:1756kB inactive:1604kB
>>154963 bounce buffer pages
Please try "echo 100 > /proc/sys/vm/lower_zone_protection" to start reclaiming
lowmem earlier and see if that prevents the OOMkills.
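For anyone reading along, the two quoted log lines can be checked with a minimal Python sketch. It assumes 4 KiB pages (the usual size on 32-bit x86), and parse_zone_line is a hypothetical helper written only for this illustration, not a kernel or distribution tool.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages, the usual size on 32-bit x86


def parse_zone_line(line):
    """Return the kB fields (free, min, low, high, ...) of a show_mem() zone line."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)kB", line)}


# Log lines quoted in the comment above.
zone = parse_zone_line(
    "Normal free:640kB min:928kB low:1856kB high:2784kB active:1756kB inactive:1604kB"
)
bounce_pages = 154963

# Free lowmem below the "min" watermark means the zone is effectively exhausted.
below_min = zone["free"] < zone["min"]
# Convert the bounce buffer page count to MiB under the page-size assumption.
bounce_mib = bounce_pages * PAGE_KB / 1024

print(f"Normal zone: free={zone['free']}kB min={zone['min']}kB below_min={below_min}")
print(f"bounce buffers: about {bounce_mib:.0f} MiB of lowmem")
```

Under that assumption, free (640kB) is below the min watermark (928kB) and the bounce buffers hold roughly 605 MiB of lowmem.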
OOM kills still happen after applying the setting from your comment; please check the attached log file.
Created attachment 137722 [details]
log file after reclaim lowmem
Wait!!! None of these OOM kills are running a hugemem kernel. Please make sure
your /boot/grub/grub.conf file selects the "kernel-hugemem-2.6.9-<whatever>.EL".
It is currently booting the smp kernel.
Sorry I didn't notice that after your comment #7.
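A quick way to confirm which kernel flavor is actually booted is to look at the running release string. The sketch below is only an illustration: kernel_flavor is a hypothetical helper, and it assumes the release string (what `uname -r` reports) carries an "smp" or "hugemem" suffix, which may not hold on every kernel build.

```python
import platform


def kernel_flavor(release):
    """Classify a kernel release string by flavor suffix (naming is an assumption)."""
    if "hugemem" in release:
        return "hugemem"
    if "smp" in release:
        return "smp"
    return "other"


# platform.release() returns the same string as `uname -r`.
running = platform.release()
print(f"running kernel: {running} -> {kernel_flavor(running)}")
```

If this prints "smp" after a reboot, grub.conf is still selecting the smp kernel as the default entry.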
Created attachment 138215 [details]
log file for hugemem kernel load
It still failed when booted into the hugemem kernel. Refer to the attachment.
Once again, this is due to bounce buffers.
>>>Oct 11 10:11:19 uut432 kernel: 776797 bounce buffer pages
1.) Does this only happen on 32-bit systems?
2.) Can you try setting /proc/sys/vm/dirty_ratio to 5 and rerunning the test?
1.) It doesn't happen on 32-bit systems. It only happens on RH4 64bit.
2.) The bonnie test passes after setting dirty_ratio to 5. Could you explain
the root cause in detail? And will this be implemented in the next update of
Are you sure it's the 64-bit systems that have the problem? All of the
show_mem() outputs are from 32-bit kernels; in other words, they have lowmem and
highmem. In this case we can't do IO to highmem, so we use lowmem bounce buffers,
and that exhausts lowmem before highmem and it can't be reclaimed. If you lower
dirty_ratio from 40 to 5, the system starts writing out bounce buffers when 5% of
RAM is dirty instead of 40%. That prevents the system from getting into this
state. Are you OK with this???
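To put rough numbers on the explanation above, here is a small Python sketch. It takes the simplified "percent of RAM" description at face value and uses the 20GB figure from the bug description; dirty_threshold_gb is a hypothetical helper, and the kernel's real threshold calculation may differ in detail.

```python
def dirty_threshold_gb(total_ram_gb, dirty_ratio_pct):
    """Amount of dirty memory (GB) at which writeout starts, as a plain percentage of RAM."""
    return total_ram_gb * dirty_ratio_pct / 100


TOTAL_RAM_GB = 20  # system memory size taken from the bug description

for ratio in (40, 5):
    threshold = dirty_threshold_gb(TOTAL_RAM_GB, ratio)
    print(f"dirty_ratio={ratio}: writeout starts at about {threshold:.0f}GB of dirty data")
```

With 20GB of RAM, dirty_ratio=40 allows about 8GB of dirty data before writeout starts, while dirty_ratio=5 starts writeout at about 1GB, so far fewer bounce buffers can pile up in lowmem.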
Yes, we tried many times on different systems/configurations with all RH 32/64bit
OSes. It only happened on RH4 64bit; we can't reproduce it on RH3 64bit or other 32bit OSes.
Thanks for debugging and root-causing this problem. But I do believe this is
a kernel bug, so will this issue be fixed in a new RH kernel?
Please advise.
Will this workaround/solution be implemented in a new kernel or OS?
*** This bug has been marked as a duplicate of 193542 ***