Bug 149011 - Oracle 8 import of Oracle 9 database can lock system.
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Larry Woodman
QA Contact: Brian Brock
Blocks: 156320
Reported: 2005-02-17 16:56 EST by Hisashi T Fujinaka
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RHSA-2005-663
Doc Type: Bug Fix
Last Closed: 2005-09-28 10:47:17 EDT

Attachments
AltSysRqM output (8.46 KB, text/plain)
2005-03-01 14:03 EST, Hisashi T Fujinaka
AltSysRq[PWT] output from syslog (31.88 KB, text/plain)
2005-04-15 14:20 EDT, Hisashi T Fujinaka

Description Hisashi T Fujinaka 2005-02-17 16:56:25 EST
Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Every time, with a particular database.

Steps to Reproduce:
1. Export data from Oracle 9 (our specific data set) to an Oracle 8 machine.
2. Import to Oracle 8.
3. Watch the system consume all memory/swap and become unresponsive.
Actual results:

The database imports and when integrity constraints are being applied, the kernel keeps 
allowing memory to be consumed until the system dies. This is on a patched RHEL 3 
system. I watched the free memory/swap go away with top.

Expected results:

I would expect overcommit_memory to be set, or something would stop the process from 
consuming ALL of memory.

This has worked in the past, and I don't know what has changed in the data set. However, the 
kernel still should not allow Oracle to keep spawning threads until they consume all of memory.

Additional info:

Unfortunately, the data set has patient information and we cannot provide it.
Comment 1 Suzanne Hillman 2005-02-18 14:38:34 EST
Since this refers to a RHEL3 system, not a RHEL4 one, I'm modifying the version
number accordingly.
Comment 2 Larry Woodman 2005-03-01 09:15:10 EST
Please provide a lot more information so I can start debugging this
problem: top output, processor type, exact kernel version string,
AltSysrq-M outputs, etc.

Larry Woodman
Comment 3 Hisashi T Fujinaka 2005-03-01 14:01:42 EST
Here is the uname output:
Linux beatle.verinform.com 2.4.21-27.0.2.EL #1 Wed Jan 12 23:46:37 EST 2005 i686 i686 
i386 GNU/Linux

And the AltSysrq-M output will be attached.

I'm unclear about what else you want. Do you want the top output before the system 
Comment 4 Hisashi T Fujinaka 2005-03-01 14:03:32 EST
Created attachment 111542 [details]
AltSysRqM output
Comment 5 Larry Woodman 2005-04-06 14:32:00 EDT
Hisashi, the AltSysrq-M does not show any problems with memory.  Please get me
several AltSysrq-P outputs and one AltSysrq-W and one AltSysrq-T output when the
system is hung so I can see what is running on each CPU and what each process is
blocked on.

Thanks, Larry Woodman
Comment 6 Larry Woodman 2005-04-06 15:50:44 EDT
OK, I think I see the problem here:

On one CPU kswapd calls launder_page() which increments the page->count and
calls page_cache_release() with the zone->lru_lock held when that page is being
On another CPU, if the last process that maps that page calls exit,
page_cache_release() gets called for the same page.  If that's the last reference
to the page and it races with kswapd, launder_page() will call __free_pages_ok()
with the zone->lru_lock held and deadlock.

This patch fixes this problem:
--- linux-2.4.21/mm/vmscan.c.orig
+++ linux-2.4.21/mm/vmscan.c
@@ -315,7 +315,9 @@ int launder_page(zone_t * zone, int gfp_
 	if (cache_ratio(zone) > cache_limits.max && page_anon(page) &&
 			free_min(zone) < 0) {
 		add_page_to_active_list(page, INITIAL_AGE);
+		lru_unlock(zone);
+		lru_lock(zone);
 		return 0;
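
The race described in this comment can be modelled in userspace. What follows is only a sketch, not the kernel code: every name in it (zone_model, page_model, model_launder and so on) is hypothetical, the lock is a plain flag that records a re-entry instead of spinning, and it assumes that the final reference drop needs the same LRU lock that the caller already holds. It also assumes the reference is dropped while the lock is released, which the patch excerpt above does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct zone_model {
    bool lru_locked;     /* stands in for zone->lru_lock */
    bool would_deadlock; /* set when the holder tries to take the lock again */
};

struct page_model {
    int count;           /* stands in for page->count */
    bool freed;
    struct zone_model *zone;
};

/* Take the model lock; a real spinlock would spin forever on re-entry. */
static bool model_lru_lock(struct zone_model *z)
{
    if (z->lru_locked) {
        z->would_deadlock = true;
        return false;
    }
    z->lru_locked = true;
    return true;
}

static void model_lru_unlock(struct zone_model *z)
{
    assert(z->lru_locked);
    z->lru_locked = false;
}

/* Drop one reference; the final drop frees the page under the LRU lock. */
static void model_page_release(struct page_model *p)
{
    assert(p->count > 0);
    if (--p->count > 0)
        return;
    if (!model_lru_lock(p->zone))
        return; /* self-deadlock detected; the page is never freed */
    p->freed = true;
    model_lru_unlock(p->zone);
}

/* The caller holds the LRU lock, as kswapd does around launder_page(). */
static void model_launder(struct page_model *p, bool drop_lock_first)
{
    (void)model_lru_lock(p->zone);
    if (drop_lock_first)
        model_lru_unlock(p->zone); /* patched ordering: unlock first */
    model_page_release(p);
    if (drop_lock_first)
        (void)model_lru_lock(p->zone); /* retake before returning */
    model_lru_unlock(p->zone);
}
```

Calling model_launder(&page, false) on a page whose count is 1 sets would_deadlock, which corresponds to the hang; calling it with true frees the page and leaves the lock released.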
Comment 7 Hisashi T Fujinaka 2005-04-15 14:17:40 EDT
I have some more AltSysRq output, but now the process dies properly.

Now, if you can forward my errors to Oracle, somehow, since their imp triggered this kernel bug.
Comment 8 Hisashi T Fujinaka 2005-04-15 14:20:22 EDT
Created attachment 113241 [details]
AltSysRq[PWT] output from syslog
Comment 9 Larry Woodman 2005-04-15 15:06:43 EDT
Hisashi, from looking at this AltSysrq-M output it appears that the system hung
because 182589 pages of anonymous memory were VM_LOCK'd:

>>>aa:182592 ac:1665 id:54 il:0 ic:0 fr:636 

Is your application doing something that mlock()s memory or something???

Larry Woodman
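
To put the quoted counters in perspective, the page counts can be converted to kilobytes. This is a small sketch under stated assumptions: it assumes the 4 KB page size that is usual on i386, it takes the field labels exactly as they appear in the quoted line without interpreting them further, and the helper names (vm_counters, parse_vm_counters, pages_to_kb) are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_PAGE_KB 4UL /* assumption: 4 KB pages, as is usual on i386 */

/* Counters named after the labels in the quoted AltSysrq-M line. */
struct vm_counters {
    unsigned long aa, ac, id, il, ic, fr;
};

/* Parse one counter line; returns 1 on success, 0 if it does not match. */
static int parse_vm_counters(const char *line, struct vm_counters *out)
{
    const char *start;

    assert(line != NULL && out != NULL);
    start = strstr(line, "aa:"); /* skip any quoting prefix such as ">>>" */
    if (start == NULL)
        return 0;
    return sscanf(start, "aa:%lu ac:%lu id:%lu il:%lu ic:%lu fr:%lu",
                  &out->aa, &out->ac, &out->id,
                  &out->il, &out->ic, &out->fr) == 6;
}

/* Convert a page count to kilobytes under the page-size assumption. */
static unsigned long pages_to_kb(unsigned long pages)
{
    return pages * MODEL_PAGE_KB;
}
```

With the line quoted above, the aa field comes to 730368 kB (about 713 MB) and the fr field to 2544 kB under that assumption.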
Comment 10 Hisashi T Fujinaka 2005-04-15 15:10:31 EDT
Unfortunately, all I'm doing is running "imp" from Oracle. We mere mortals aren't privy to the 
inner workings of Oracle programs.
Comment 11 Larry Woodman 2005-04-15 15:20:47 EDT
Ah, wait!  You have no swap space free!!  That's the problem!!!

>>>Free swap:            0kB 

Fix that and the problem will go away.

Larry Woodman
Comment 12 Hisashi T Fujinaka 2005-04-15 15:26:35 EDT
Please read the opening bug. Having all my swap consumed is what this bug is all about.
Comment 13 Larry Woodman 2005-04-15 15:29:55 EDT
OK, sorry.  What is /proc/sys/vm/overcommit_memory?

Comment 14 Hisashi T Fujinaka 2005-04-15 18:31:59 EDT
OK, to clarify: the latest Alt-SysRq info was from a PATCHED system, using the patch sent by Larry 
via the web page. Larry's fix now causes the program to crash after consuming all memory, which is 
better than the old behavior, which was a hang forever.
Comment 15 Hisashi T Fujinaka 2005-04-15 18:33:33 EDT
/proc/sys/vm/overcommit_memory is the default, 0.
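
For anyone checking the same setting, the value can be read programmatically. The sketch below is an illustration only: the function names are hypothetical, and the mode labels follow the mainline kernel documentation (0 heuristic, 1 always overcommit, 2 never overcommit), which is an assumption here because the RHEL 3 2.4.21 kernel may define the values differently.

```c
#include <assert.h>
#include <stdio.h>

/* Labels per the mainline kernel documentation; RHEL 3 may differ. */
static const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic";
    case 1:  return "always";
    case 2:  return "never";
    default: return "unknown";
    }
}

/* Read the integer mode from the given file; returns -1 on any failure. */
static int read_overcommit_mode(const char *path)
{
    FILE *f;
    int mode = -1;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

A caller would pass "/proc/sys/vm/overcommit_memory" as the path and hand the result to overcommit_mode_name(); the value 0 reported in this comment maps to the heuristic mode under that assumption.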
Comment 16 Ernie Petrides 2005-04-22 20:41:17 EDT
A fix for this problem has just been committed to the RHEL3 U6
patch pool this evening (in kernel version 2.4.21-32.2.EL).
Comment 24 Red Hat Bugzilla 2005-09-28 10:47:17 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

