From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.2.19-6.2.1smp i686)

Description of problem:
Running 2.4.3-12enterprise, I'm seeing fairly reproducible hangs under heavy memory usage. The exact point of failure is not reproducible, though the mode of failure is consistent.

How reproducible:
Always

Steps to Reproduce:
Attempting to simulate load on the memory system using the C program below:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>
    #include <unistd.h>
    #include <string.h>

    /* want to stress memory allocation and deallocation, preferably while
     * many other tests are going in the background */

    #define MB (1024 * 1024)
    /* Sets maximum number of 1MB blocks to try to allocate */
    /* Currently fixed to 2100 MB (about 2 GB) total */
    #define MAX 2100

    int main (int argc, char **argv)
    {
        struct timeval pre, post;
        int i;
        long TotalTime;
        void *MemBlocks[MAX];
        long TimeDiff;

        for (;;) {
            /* allocate */
            i = 0;
            TotalTime = 0;
            gettimeofday(&pre, NULL);
            while ((i < MAX) && ((MemBlocks[i] = malloc(MB)) != NULL)) {
                /* write to memory to force the grab, otherwise it's a lazy
                 * grab in the kernel and won't be really allocated */
                memset(MemBlocks[i], 0, MB);
                gettimeofday(&post, NULL);
                /* elapsed microseconds; the usec term may be negative */
                TimeDiff = (long)(post.tv_sec - pre.tv_sec) * 1000000
                         + (long)(post.tv_usec - pre.tv_usec);
                printf("Allocation of block %i succeeded and took %ld usec\n",
                       i, TimeDiff);
                TotalTime += TimeDiff;
                i++;
                gettimeofday(&pre, NULL);
            }
            printf("Allocation of block %d failed. Deallocating all memory.\n", i);
            printf("Total of %ld usec spent in allocation\n", TotalTime);
            if (i > 0)  /* avoid dividing by zero if the first malloc fails */
                printf("Average of %ld usec/MB\n", TotalTime / i);
            fflush(stdout);
            sleep(5);

            /* deallocate */
            i--;
            for (; i >= 0; i--) {
                free(MemBlocks[i]);
            }
            printf("Done freeing, pausing before starting again.\n");
            fflush(stdout);
            sleep(5);
        }
        return 0;
    }

I'm running two instances of the resulting program with the designed effect of taking me deep into swap.
I have 128 MB of RAM and 6 GB of swap, enabled as one 2 GB swap partition and two 2 GB swap files.

Actual Results:
Somewhere in the allocation loop, so far seen with both instances of the above code having allocated more than 1 GB, the system hangs. I have seen both instances cycle once through allocation and deallocation and then fail in the second allocation loop.

Additional info:
The one bit of consistency I've seen so far is that an Alt-SysRq-induced mem-info has consistently shown:

Free Pages: 1396 kB (0 kB High Mem)
The first memory line totals 512 kB (seen once as 2 x 256 kB, the rest of the time as 1 x 512 kB)
The second consistently totals 884 kB
How much RAM and swap do you have?
As stated in the above report:

128 MB RAM
6 GB swap, broken into:
  2 GB partition
  2 GB file
  2 GB file
I got this to reproduce on another system with 2.4.3-12enterprise:

4x PIII 550
256 MB RAM
5 GB swap

[root@dhcpd134 /root]# cat /proc/meminfo
             total:     used:      free:     shared:  buffers:   cached:
    Mem:     261275648  237207552  24068096  49152    192446464  15745024
    Swap:    953319424  0          953319424
    MemTotal:       255152 kB
    MemFree:         23504 kB
    MemShared:          48 kB
    Buffers:        187936 kB
    Cached:          15376 kB
    Active:         203248 kB
    Inact_dirty:       112 kB
    Inact_clean:         0 kB
    Inact_target:     6112 kB
    HighTotal:           0 kB
    HighFree:            0 kB
    LowTotal:       255152 kB
    LowFree:         23504 kB
    SwapTotal:     5125280 kB
    SwapFree:      5125280 kB

[root@dhcpd134 /root]# swapon -s
    Filename     Type        Size      Used  Priority
    /dev/sda2    partition   1028152   0     -1
    /dev/sdb1    partition   2097136   0     -2
    /swap/swap   file        1999992   0     -3

The system locked with one memgobble having allocated 1849 blocks and the other 1798.

I don't think this is a peak condition on either system, as I have seen both memgobble processes finish the allocation loop, deallocate, and then lock the system in their second allocation run.
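A note on the numbers above: the "Swap:" byte line (953319424) does not match SwapTotal (5125280 kB). One possible explanation, offered only as an assumption and not checked against kernel source, is that the byte figure wrapped in a 32-bit counter. The sketch below checks that arithmetic, and also that the three swapon sizes add up to SwapTotal. The function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: convert a size in kB to bytes and keep only the
 * low 32 bits, as a 32-bit unsigned counter would. That the byte line
 * was produced this way is an assumption, not a confirmed fact. */
uint32_t kb_to_bytes_wrapped32(uint64_t kb)
{
    uint64_t bytes = kb * 1024ULL;
    return (uint32_t)bytes;   /* conversion reduces the value mod 2^32 */
}

/* Sum of the three swap areas listed by swapon -s above, in kB. */
uint64_t swap_total_kb(void)
{
    return 1028152ULL + 2097136ULL + 1999992ULL;
}
```

With the figures from this report, swap_total_kb() gives 5125280, and kb_to_bytes_wrapped32(5125280) gives 953319424, which is the value on the "Swap:" line. If that reading is right, the mismatch is a display issue and not missing swap.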
As a point of reference, I ran the same load on the original system running 2.2.14-5. It has gone through 3 alloc and dealloc cycles so far without failing. My observation is that it is MUCH slower than 2.4, sometimes taking as much as 45 seconds to get a block of memory.
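To compare kernels on figures like the 45-second worst case, it may help to record the slowest single allocation as well as the average. A minimal sketch, assuming the same gettimeofday() sampling as the test program; both helper names are hypothetical and not part of the original program.

```c
#include <sys/time.h>

/* Hypothetical helper: microseconds elapsed between two gettimeofday()
 * samples. The usec term may be negative; the sum is still correct. */
long elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000000L
         + (long)(end->tv_usec - start->tv_usec);
}

/* Hypothetical helper: keep the slowest single allocation seen so far. */
long worst_usec(long worst_so_far, long sample)
{
    return sample > worst_so_far ? sample : worst_so_far;
}
```

In the allocation loop this would amount to calling worst_usec(worst, elapsed_usec(&pre, &post)) after each block and printing the result next to the average.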
I have a similar problem with a plain 2.4.3 kernel: running XMMS and Opera at the same time causes a reproducible hang due to heavy swapping (memory hole in the kernel?). I used the 2.4.3 kernel with the SGI XFS patches, but it seems very unlikely that this behaviour is related to the XFS patches.
Well, the XFS patches cause a totally different VM load pattern, so this could very well be related to those patches.
I also see *similar* results with this running 2.4.16 on a four-processor Dell 8450 with 4 GB RAM and 2 GB swap. It takes seconds to get memory; in a high-volume transaction system, this could effectively hang the server, which is indeed a result we see in at least one case.
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/