Bug 208210
Summary: | Run Bonnie++ for memory test fail and system shows error message "out of memory" | |
---|---|---|---
Product: | Red Hat Enterprise Linux 4 | Reporter: | allance <allance.chen>
Component: | kernel | Assignee: | Larry Woodman <lwoodman>
Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock>
Severity: | high | Docs Contact: |
Priority: | medium | |
Version: | 4.0 | |
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2006-12-08 13:36:46 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
allance
2006-09-27 02:13:16 UTC
Pere, please attach the show_mem() output that was written to /var/log/messages when the OOM kill occurs.
Larry
Created attachment 137282
log file as OOM happen
Created attachment 137283
Display of killing process
Hi Larry, I have attached the information you want. Please help us find the root cause and a solution soon. Thanks.
The problem is that you are running the smp kernel on a 32GB system, and that's not supported. Please install the hugemem kernel and reboot your system; that will fix the problem. Let me know the results as soon as you can do this.
Larry Woodman
We did install the hugemem kernel, but the OOM still occurred. You can refer to the description at the top of this bug ID. This problem happens on systems with > 16GB memory, and only happens on RH4 32bit OS. Our system supports AMD Opteron rev.F 1207 cpu + DDR-II 667 registered memory.
Allance Chen
Can you attach the logfile from the OOM kill when it was running the hugemem kernel?
Thanks, Larry Woodman
Created attachment 137635
logfile of OOM after installing hugmem kernel
The problem is that the system consumed most of lowmem with bounce buffers:
>>Normal free:640kB min:928kB low:1856kB high:2784kB active:1756kB inactive:1604kB
>>154963 bounce buffer pages
Please try "echo 100 > /proc/sys/vm/lower_zone_protection" to start reclaiming
lowmem earlier and see if that prevents the OOMkills.
Larry Woodman
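The two show_mem() lines quoted in the comment above can be checked mechanically. The following is a minimal sketch, not part of the original report: the function name `parse_show_mem` is hypothetical, and the 4 KiB page size is an assumption (the standard x86 page size) used only to convert the bounce buffer page count into kilobytes.

```python
import re

# Assumption: standard 4 KiB x86 pages, used to convert pages to kB.
PAGE_SIZE_KB = 4


def parse_show_mem(text):
    """Pull the Normal-zone watermarks and bounce buffer count out of show_mem() text."""
    info = {}
    zone = re.search(
        r"Normal free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB", text
    )
    if zone:
        free_kb, min_kb, low_kb, high_kb = map(int, zone.groups())
        info.update(free_kb=free_kb, min_kb=min_kb, low_kb=low_kb, high_kb=high_kb)
    bounce = re.search(r"(\d+) bounce buffer pages", text)
    if bounce:
        info["bounce_pages"] = int(bounce.group(1))
        info["bounce_kb"] = info["bounce_pages"] * PAGE_SIZE_KB
    return info


# The two lines quoted from the attached log.
sample = (
    ">>Normal free:640kB min:928kB low:1856kB high:2784kB active:1756kB inactive:1604kB\n"
    ">>154963 bounce buffer pages\n"
)
report = parse_show_mem(sample)
below_min = report["free_kb"] < report["min_kb"]  # free lowmem under the min watermark
print(report, below_min)
```

Under the 4 KiB assumption, 154963 bounce buffer pages come to 619852 kB (about 605 MiB), while the Normal zone has only 640 kB free against a 928 kB minimum.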
OOM kills still happen after following your comment; please check the attached log file.
Created attachment 137722
log file after reclaim lowmem
Wait!!! None of these OOM kills occurred while running a hugemem kernel. Please make sure your /boot/grub/grub.conf file selects the "kernel-hugemem-2.6.9-<whatever>.EL". It is currently booting the smp kernel. Sorry I didn't notice that after your comment #7.
Larry Woodman
Created attachment 138215
log file for hugmem kernel load
It still failed after booting into the hugemem kernel. Refer to the attachment.
Once again, this is due to bounce buffers.
>>>Oct 11 10:11:19 uut432 kernel: 776797 bounce buffer pages
1.) Does this only happen on 32-bit systems?
2.) Can you try setting /proc/sys/vm/dirty_ratio to 5 and rerunning the test?
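For scripted test runs, the two tunables mentioned in this thread (/proc/sys/vm/lower_zone_protection and /proc/sys/vm/dirty_ratio) can be read and set from a small helper. This is a sketch only: the function names `read_tunable` and `write_tunable` are hypothetical, writing requires root, and lower_zone_protection may not exist on kernels other than the one discussed here.

```python
from pathlib import Path

# Directory holding the VM tunables named in the thread.
VM_DIR = Path("/proc/sys/vm")


def read_tunable(name, base=VM_DIR):
    """Return the current value of a VM tunable as a stripped string."""
    return (Path(base) / name).read_text().strip()


def write_tunable(name, value, base=VM_DIR):
    """Write a new value to a VM tunable (needs root when base is /proc/sys/vm)."""
    (Path(base) / name).write_text(f"{value}\n")
```

For example, `write_tunable("dirty_ratio", 5)` would apply the setting suggested above, and `read_tunable("dirty_ratio")` would confirm it before rerunning the test. The `base` parameter exists only so the helper can be exercised against an ordinary directory.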
1.) It doesn't happen on 32-bit systems. It only happens on RH4 64bit.
2.) The bonnie test passes after setting dirty_ratio to 5.
Could you explain the root cause in detail? And will this be implemented in the next update of RH4 64bit?
Are you sure it's the 64-bit systems that have the problem? All of the show_mem() outputs are from 32-bit kernels; in other words, they have lowmem and highmem. In this case we can't do IO to highmem, so we use lowmem bounce buffers. That exhausts lowmem before highmem, and it can't be reclaimed. If you lower dirty_ratio from 40 to 5, the system starts writing out bounce buffers when 5% of RAM is dirty instead of 40%. That prevents the system from getting into this state. Are you OK with this?
Larry
Yes, we tried many times on different systems/configurations in all RH 32/64bit OS. It only happened on RH4 64bit; we can't reproduce it on RH3 64bit or other 32bit OS. Thanks for debugging and root-causing this problem. But I do believe this is a kernel bug, so will this issue be fixed in a new RH kernel? Please advise. Will this workaround/solution be implemented in a new kernel or OS?
*** This bug has been marked as a duplicate of 193542 ***
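To put numbers on the dirty_ratio explanation in the thread, the sketch below computes how much dirty data is allowed before writeback starts at ratios of 40 and 5. It follows the comment's own wording (a percentage of RAM) and uses the 32GB system size mentioned earlier. The function name `dirty_threshold_bytes` is hypothetical, and the kernel's actual calculation may differ in detail.

```python
GIB = 1024 ** 3


def dirty_threshold_bytes(total_ram_bytes, dirty_ratio_percent):
    """Dirty-memory level at which writeback starts, modeled as a plain percentage of RAM."""
    return total_ram_bytes * dirty_ratio_percent // 100


ram = 32 * GIB  # system size taken from the thread
for ratio in (40, 5):
    threshold = dirty_threshold_bytes(ram, ratio)
    print(f"dirty_ratio={ratio}: writeback starts at about {threshold / GIB:.1f} GiB dirty")
```

Under this simplified model, the default of 40 lets roughly 12.8 GiB of dirty data accumulate on a 32 GiB machine, while 5 caps it near 1.6 GiB, so writeback starts much sooner.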