Bug 97233 - malloc hangs using bigpages
Summary: malloc hangs using bigpages
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Larry Woodman
QA Contact: Brian Brock
URL: http://www.redhat.com/whitepapers/rhe...
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-06-11 18:15 UTC by rob lojek
Modified: 2007-11-30 22:06 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-07-15 21:36:07 UTC
Target Upstream Version:
Embargoed:



Description rob lojek 2003-06-11 18:15:29 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b)
Gecko/20030516 Mozilla Firebird/0.6

Description of problem:
On a machine (compaq dl380 G-3) with 4 gB RAM, trying to malloc 2 gB of RAM with
bigpages enabled locks up at about 1.4 gB, though it eventually finishes in
50-70 seconds.

After disabling bigpages feature, the operation completes successfully in about
6 seconds, the same time as on non-bigpages-capable distributions (RH 7.x/8.0/9).

Oracle and Red Hat recommend using bigpages to enhance Oracle performance in
these docs:

http://www.redhat.com/whitepapers/rhel/OracleLinuxInstallTips.pdf
http://otn.oracle.com/tech/linux/pdf/1_linuxVM_v2_accepted.pdf

Version-Release number of selected component (if applicable):
2.4.9-e.24 and all other kernels

How reproducible:
Always

Steps to Reproduce:
1. install stock RHAS on machine with 4 gB of RAM.
2. upgrade to e.24-enterprise kernel
3. add this line to rc.local:

## Bigpages -- check with 'cat /proc/meminfo'
echo 2 > /proc/sys/kernel/shm-use-bigpages      ## bigpages in shmfs

4. add this line to /etc/lilo.conf for the kernel you're going to boot:

        append="bigpages=2100MB"

5. run lilo ('lilo -v')
6. reboot
7. compile the attached source (gcc -o /usr/local/bin/slurpmem slurpmem.c)
8. run slurpmem like this: 'slurpmem 2000', which attempts to allocate 2 gB of
RAM via malloc() calls.
9. box will freeze at about 1.5 gB, taking about 50 seconds to allocate RAM.

control case:
1. comment out the "append=" line in lilo.conf from step 4. above.
2. run lilo & reboot
3. repeat step 8.


Actual Results:  severe lock-up after about 1.4 gB of allocation

Expected Results:  no lock-up

Additional info:

Here's the C program that we use to reproduce the problem:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MEG ( 1024 * 1024 )


int main(int argc, char **argv) {

        char    **stored;
        int     megs;
        int i;


        if ( argc < 2 ) {

                printf("No argument specified.\n");
                exit(1);

        }

        megs = atoi(argv[1]);

        printf("%d megabytes will be slurped\n", megs);

        stored = (char **) malloc( sizeof( char* ) * megs );

        if ( stored == NULL ) {

                perror("malloc");
                exit(1);
        }

        printf("Megs zeroed: ");

        for ( i = 0; i < megs; i++ ) {

                stored[i] = (char *) malloc( MEG );

                if ( stored[i] == NULL ) {
                        perror("malloc");
                        exit(1);
                }

                memset( stored[i], 0, MEG );

                printf("%d ", i);

        }

        printf("\n\n");
        printf("All allocated!\n");
        printf("Waiting for <control-c> to finish\n");
        scanf("%d", &i);

        exit(0);
}
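
To pin down where the stall begins, each 1 MB step could be timed individually. The following is only a rough sketch, not part of the attached slurpmem source: the helper name timed_slurp is made up for illustration, and it fills with a nonzero byte (rather than zero, as slurpmem does) so that the write cannot be satisfied by untouched zero pages. On older glibc versions, clock_gettime() may need -lrt at link time.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: allocate `bytes`, write to every byte so the
 * pages are really backed by memory, and return the elapsed wall-clock
 * time in seconds. Returns -1.0 if malloc() or the clock call fails.
 * The block is handed back through `out`; the caller owns it. */
double timed_slurp(size_t bytes, char **out)
{
    struct timespec start, end;
    char *p;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    p = malloc(bytes);
    if (p == NULL)
        return -1.0;

    memset(p, 0xA5, bytes);   /* nonzero fill forces real page writes */

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0) {
        free(p);
        return -1.0;
    }

    *out = p;
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling timed_slurp(MEG, &stored[i]) inside the loop and printing any step that takes more than, say, a tenth of a second would show at which megabyte the slowdown starts.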

Comment 1 Arjan van de Ven 2003-06-11 18:21:24 UTC
memory you set aside for bigpages is not available for normal use; so you
removed more than half of your ram -> result, your code uses more ram than you
have (left) -> swapping etc 
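
The arithmetic behind this can be sketched in a few lines of C. This is a back-of-the-envelope illustration only: it assumes "4 gB" means 4096 MB, ignores whatever the kernel and other processes use, and the function names are made up for the example.

```c
/* Memory left for ordinary allocations once a bigpages pool has been
 * reserved at boot. All values are in megabytes. */
long normal_memory_left_mb(long total_mb, long bigpages_mb)
{
    return total_mb - bigpages_mb;
}

/* Nonzero if a request of request_mb fits in what is left,
 * zero if it would have to spill into swap. */
int request_fits(long request_mb, long total_mb, long bigpages_mb)
{
    return request_mb <= normal_memory_left_mb(total_mb, bigpages_mb);
}
```

With total_mb = 4096 and bigpages_mb = 2100, only 1996 MB remain, so 'slurpmem 2000' asks for slightly more than the normal pool can hold even before the kernel's own usage is counted.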

Comment 2 rob lojek 2003-06-11 18:30:44 UTC
there's absolutely no swapping going on--I'll post a 'free' in a minute.

There's also nothing at all running on the machine. Obviously, you should be
able to allocate 2 gB of RAM on a machine that has 4 gB of RAM installed--I the
OS is using 2 gB of RAM!

Here's a quote from this document on RH's site:
http://www.redhat.com/whitepapers/rhel/OracleLinuxInstallTips.pdf

"For a SGA of 4GB, bigpages of size 4100MB could be set and for SGA of 2GB,
bigpages of size 2100MB could be set."

We'd simply like to be able to follow this documentation, and have malloc() not
hang during a trivial memory allocation. We're having a tough time running
oracle using 'bigpages', per that document.

Comment 3 rob lojek 2003-06-11 18:31:35 UTC
sorry, that should read "the OS _isn't_ using 2 gB of RAM"

Comment 4 rob lojek 2003-06-11 18:35:13 UTC
Sorry, misread your original reply. Yes, there's indeed swapping going on, which
explains the delay at ~1.4 gB.
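
For anyone wanting to confirm swapping from inside a test program rather than from 'free', one possible approach is the Linux sysinfo(2) call. The sketch below is an assumption about how one might check it, not something taken from this report; the function name swap_used_bytes is hypothetical.

```c
#include <sys/sysinfo.h>

/* Hypothetical helper: bytes of swap currently in use,
 * or -1 if sysinfo() fails. */
long long swap_used_bytes(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return -1;

    /* totalswap and freeswap are counted in units of mem_unit bytes. */
    return (long long)(si.totalswap - si.freeswap) * si.mem_unit;
}
```

Sampling this value before and after the allocation loop and seeing it rise would indicate that the delay comes from swapping rather than from malloc() itself.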

