Description of problem:
On a RHEL3 system running the hugemem kernel and Oracle 9i (9.2.0.5), we're attempting to set up a database with a ~6GB SGA, comprising about 2.1GB of shared pool (allocated from hugepages) and 4GB of database cache (allocated from /dev/shm, which is mounted as tmpfs). The database starts successfully and users can connect via SQL*Net. However, if a user signs on to the system in question and runs sqlplus directly while their memlock (ulimit -l) setting is "too low", the entire system hangs as soon as they enter a password. At that point no further logins are possible, even on the console.

I put "too low" in quotes because it's not clear how high the value has to be: in my testing, values down to 1000000 were OK, but 500000 caused the system to hang. Setting it to 4 (the default) hangs it every time.

We discovered this bug because of bug 113335 (apparently ignored since it was filed), which causes a local user's memlock settings as specified in limits.conf to be ignored when they sign on via ssh.

To trigger this bug it's also necessary to set "use_indirect_data_buffers = true" in the init.ora file for the database (which causes Oracle to use /dev/shm rather than process memory for that portion of the SGA).

Version-Release number of selected component (if applicable):
kernel-hugemem-2.4.21-15.0.3.EL

How reproducible:
See above.

Steps to Reproduce:
1. See above.

Actual results:
System hangs.

Expected results:
System operates normally, or fails gracefully.

Additional info:
I should add that if the shared portion of the SGA is bumped up to about 2.5GB and hugetlb is in use, it's not even necessary to run sqlplus to hang the system: it hangs as soon as the database is started. So among its other problems, hugetlb causes erratic behavior as process memory usage approaches the 2.7GB boundary. This seems like a distinct bug to me (though it may be related), but I won't be filing it separately.
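For anyone who wants to check a session before running sqlplus locally, here is a rough sketch that reads the current memlock soft limit and compares it with the values from the testing above. The threshold constant and both function names are illustrative only, it assumes ulimit -l reports kilobytes (as bash does), and since the real cutoff between 500000 and 1000000 is unknown, treat the result as a hint rather than a guarantee.

```python
import resource

# Assumption: 1000000 was the lowest value observed to be OK in the report
# above (500000 hung the system); the true cutoff is unknown.
ASSUMED_SAFE_MEMLOCK_KB = 1000000


def memlock_soft_limit_kb():
    """Return the memlock soft limit in KB, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft == resource.RLIM_INFINITY:
        return None
    # getrlimit reports bytes; ulimit -l reports kilobytes.
    return soft // 1024


def memlock_looks_safe(limit_kb, threshold_kb=ASSUMED_SAFE_MEMLOCK_KB):
    """True if the limit is unlimited or at least the assumed threshold."""
    return limit_kb is None or limit_kb >= threshold_kb


if __name__ == "__main__":
    limit = memlock_soft_limit_kb()
    shown = "unlimited" if limit is None else "%d KB" % limit
    if memlock_looks_safe(limit):
        print("memlock soft limit %s: at or above the assumed threshold" % shown)
    else:
        print("memlock soft limit %s: below the assumed threshold, "
              "do not run sqlplus locally" % shown)
```

Run it in the same login session (ssh or console) that would start sqlplus, since bug 113335 means the limit can differ depending on how the user signed on.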
In addition to being a serious stability issue for Oracle 9i installations using large SGAs, this bug also allows any local user of such a system to hang that system, so it's potentially a serious denial of service attack as well. However, I think it's much more likely to happen through happenstance than it is to be used as an attack vector. The workaround so far is just to disable hugetlb. Given the issues with hugetlb noted in bug 127896, I have to say that it appears that the hugetlb implementation in RHEL3 is very unstable and should be avoided like the plague.
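To confirm the workaround has actually taken effect, one option is to check that no hugepages remain reserved. The sketch below parses /proc/meminfo for a HugePages_Total line; that field name is an assumption about what this kernel exposes, and the function name is made up for illustration.

```python
def hugepages_total(meminfo_text):
    """Return the HugePages_Total value from meminfo text, or None if absent.

    Assumption: the kernel reports a line of the form 'HugePages_Total: N'.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("HugePages_Total:"):
            return int(line.split(":", 1)[1].split()[0])
    return None


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        total = hugepages_total(f.read())
    if total is None:
        print("No HugePages_Total line found in /proc/meminfo")
    elif total == 0:
        print("No hugepages reserved")
    else:
        print("%d hugepages still reserved" % total)
```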
How much physical memory does your server have?
Sorry: 8GB.
Martin, can you try to reproduce this bug in our lab as soon as you get a chance? Thanks, Larry
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.