Bug 805645
Summary: | Kernel freeze under rapid allocation of memory | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Bob Fries <BobsBugs052> | ||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 16 | CC: | gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2012-03-26 16:01:14 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.

Seems to work as designed here. Once the system runs out of memory, it kills the process hogging it all...

[ 7264.912293] Out of memory: Kill process 4387 (a.out) score 545 or sacrifice child
[ 7264.912919] Killed process 4387 (a.out) total-vm:7344156kB, anon-rss:4490552kB, file-rss:4kB

Dave,

Thanks for the pointers. I upgraded to the new kernel. It did change at least my perception of what was happening. The system was sluggish but it still showed signs of life. That made me wonder if I perhaps got duped into thinking the system had frozen when the problem was just that the X server was not updating the screen or responding to keystrokes.

So I went back to the previous kernel and ran from a text console. There I was still able to switch consoles, and the one running htop showed updates. So I again tried running under X. It showed the symptoms I had seen before. I have been reluctant to just let the system run to see if it comes out of it on its own, because the lockout could stop the fan control from responding to a hot component. So it might be that it would have recovered given enough time.

Bottom line is that I was wrong about the freeze. However, something has changed between the two kernels that changes the perception under X of what has happened. The newer kernel allows the X server to update the screen occasionally. It is sluggish, as might be expected on a heavily loaded machine, but it still shows signs of life. The previous kernel left a situation where even ctrl-alt-Fx combinations seemed to be ignored.

Much earlier I saw this same thing on a machine I was connected to through a network connection. That machine stopped responding also. So it is not just the X server that was affected.

It seems that what happens is not a freeze, so my title is inaccurate and the problem I was having is more of a lockout of processes. And that seems to have been mitigated with the current kernel.

As far as I am concerned this bug report can be closed.

-bob |
Created attachment 571801 [details]
C program that will allocate and touch memory.

Description of problem:
Running two simultaneous instances of the attached program with the argument set to one below the number of gigabytes of RAM in the system will cause a hard freeze.

Version-Release number of selected component (if applicable):
I have observed this behavior under various kernels using Fedora 15 and the current kernel I am using on Fedora 16, which is:
Linux version 3.2.9-1.fc16.x86_64 (mockbuild.fedoraproject.org) (gcc version 4.6.2 20111027 (Red Hat 4.6.2-1) (GCC) ) #1 SMP Thu Mar 1 01:41:10 UTC 2012

How reproducible:
On hardware where it happens it is 100% reproducible.

Steps to Reproduce:
1. Compile the attached C program:
   gcc -o MemoryCrashProg MemoryCrashProg.c
2. Run two instances of the program with the argument set to one less than the number of gigs of RAM in the system. On my current ThinkPad W520 system with 16 gigs of RAM, run it as:
   MemoryCrashProg 15 & MemoryCrashProg 15
3. The usage of memory by the program can be tracked by running htop in another window. The freeze will happen just as all of the physical RAM is used up.

Actual results:
A hard system freeze where nothing responds.

Expected results:
One or both of the processes will be terminated when available resources are exceeded.

Additional info:
The program uses malloc to allocate the specified memory and then forces it to really be available by writing an integer to each location.

I have seen the freeze happen on 3 different systems with 4, 8, and 48 processors and varying amounts of RAM and swap. Two systems I have had access to did not show this crash. One was a single-processor 32-bit machine and the other was a dual-core 64-bit machine.
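The attachment itself is not reproduced in this report. As a rough illustration of the allocate-and-touch behavior described under "Additional info", the following is a minimal sketch and not the original MemoryCrashProg.c. The function name alloc_and_touch and its return convention are assumptions made for this example only.

```c
/* Hypothetical sketch; NOT the original attachment 571801. */
#include <stddef.h>
#include <stdlib.h>

/*
 * Allocate `bytes` bytes with malloc and write an int into every
 * int-sized slot, so that each page must actually be backed by RAM
 * or swap. Returns the number of ints written, or 0 if nothing was
 * allocated. The buffer is freed before returning.
 */
size_t alloc_and_touch(size_t bytes)
{
    size_t count = bytes / sizeof(int);
    if (count == 0)
        return 0;

    /* volatile keeps the compiler from optimizing the stores away */
    volatile int *buf = malloc(count * sizeof(int));
    if (buf == NULL)
        return 0;

    for (size_t i = 0; i < count; i++)
        buf[i] = 1;

    free((void *)buf);
    return count;
}
```

A driver for this sketch would convert the gigabyte argument to bytes, for example alloc_and_touch((size_t)15 << 30) on a 64-bit build. With the kernel's default overcommit heuristic, the malloc call itself will typically succeed for a request of this size, so the memory pressure described in this report would be expected to appear during the write loop rather than at allocation time.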
The /proc/meminfo for the current machine I see this on is:

MemTotal: 16317252 kB
MemFree: 14243252 kB
Buffers: 161732 kB
Cached: 846476 kB
SwapCached: 0 kB
Active: 894748 kB
Inactive: 788028 kB
Active(anon): 686108 kB
Inactive(anon): 149968 kB
Active(file): 208640 kB
Inactive(file): 638060 kB
Unevictable: 3512 kB
Mlocked: 3512 kB
SwapTotal: 34832380 kB
SwapFree: 34832380 kB
Dirty: 276 kB
Writeback: 0 kB
AnonPages: 678132 kB
Mapped: 172284 kB
Shmem: 159240 kB
Slab: 117280 kB
SReclaimable: 59944 kB
SUnreclaim: 57336 kB
KernelStack: 3296 kB
PageTables: 53700 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 42991004 kB
Committed_AS: 1856096 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 401436 kB
VmallocChunk: 34359250620 kB
HardwareCorrupted: 0 kB
AnonHugePages: 348160 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 114688 kB
DirectMap2M: 16553984 kB