From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6

Description of problem:
On a machine (Compaq DL380 G-3) with 4 GB of RAM, trying to malloc 2 GB of RAM with bigpages enabled locks up at about 1.4 GB, though it eventually finishes in 50-70 seconds. After disabling the bigpages feature, the operation completes successfully in about 6 seconds, the same time as on non-bigpages-capable distributions (RH 7.x/8.0/9).

Oracle and Red Hat recommend using bigpages to enhance Oracle performance in these docs:
http://www.redhat.com/whitepapers/rhel/OracleLinuxInstallTips.pdf
http://otn.oracle.com/tech/linux/pdf/1_linuxVM_v2_accepted.pdf

Version-Release number of selected component (if applicable):
2.4.9-e.24 and all other kernels

How reproducible:
Always

Steps to Reproduce:
1. Install stock RHAS on a machine with 4 GB of RAM.
2. Upgrade to the e.24-enterprise kernel.
3. Add these lines to rc.local:
   ## Bigpages -- check with 'cat /proc/meminfo'
   echo 2 > /proc/sys/kernel/shm-use-bigpages ## bigpages in shmfs
4. Add this line to /etc/lilo.conf for the kernel you're going to boot:
   append="bigpages=2100MB"
5. Run lilo ('lilo -v').
6. Reboot.
7. Compile the attached source (gcc -o /usr/local/bin/slurpmem slurpmem.c).
8. Run slurpmem like this: 'slurpmem 2000', which attempts to allocate 2 GB of RAM via malloc().
9. The box will freeze at about 1.5 GB, taking about 50 seconds to allocate the RAM.

Control case:
1. Comment out the "append=" line in lilo.conf from step 4 above.
2. Run lilo and reboot.
3. Repeat step 8.
Actual Results:
Severe lock-up after about 1.4 GB of allocation.

Expected Results:
No lock-up.

Additional info:
Here's the C program that we use to reproduce the problem:

#include <stdio.h>
#include <stdlib.h>   /* malloc, atoi, exit */
#include <string.h>

#define MEG ( 1024 * 1024 )

int main(int argc, char **argv) {
    char **stored;
    int megs;
    int i;

    /* argv[0] is the program name, so one argument means argc == 2 */
    if ( argc < 2 ) {
        printf("No argument specified.\n");
        exit(1);
    }
    megs = atoi(argv[1]);
    printf("%d megabytes will be slurped\n", megs);

    /* one pointer per megabyte to be allocated */
    stored = (char **) malloc( sizeof( char* ) * megs );
    if ( stored == NULL ) {
        perror("malloc");
        exit(1);
    }

    printf("Megs zeroed: ");
    for ( i = 0; i < megs; i++ ) {
        stored[i] = (char *) malloc( MEG );
        if ( stored[i] == NULL ) {
            perror("malloc");
            exit(1);
        }
        /* writing to the block forces the kernel to back it with real pages */
        memset( stored[i], 0, MEG - 1);
        printf("%d ", i);
    }
    printf("\n\n");
    printf("All allocated!\n");
    printf("Waiting for <control-c> to finish\n");
    scanf("%d", &i);
    exit(0);
}
Memory you set aside for bigpages is not available for normal use, so you removed more than half of your RAM from the general pool. The result: your code uses more RAM than you have left, which leads to swapping, etc.
There's absolutely no swapping going on--I'll post a 'free' in a minute. There's also nothing at all running on the machine. Obviously, you should be able to allocate 2 GB of RAM on a machine that has 4 GB of RAM installed--I the OS is using 2 GB of RAM!

Here's a quote from this document on RH's site:
http://www.redhat.com/whitepapers/rhel/OracleLinuxInstallTips.pdf

"For a SGA of 4GB, bigpages of size 4100MB could be set and for SGA of 2GB, bigpages of size 2100MB could be set."

We'd simply like to be able to follow this documentation, and have malloc() not hang during a trivial memory allocation. We're having a tough time running Oracle using 'bigpages', per that document.
Sorry, that should read "the OS _isn't_ using 2 GB of RAM".
Sorry, I misread your original reply. Yes, there's indeed swapping going on, which explains the delay at ~1.4 GB.