Red Hat Bugzilla – Bug 141476
oracle DB creates very high system load
Last modified: 2007-11-30 17:07:05 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Opera
Description of problem:
After moving an Oracle DB from a 4-way IBM x445 (kernel 2.4.21-15)
to an 8-way IBM x445 (kernel 2.4.21-15.0.4.ELsmp, U3) there is
strange behaviour on the new server.
After starting up the Oracle DB and our application
we get a very high load, and about 90% of it is system load.
CPU %user %nice %system %idle
07:00:00 AM all 5.60 0.00 94.35 0.05
07:10:00 AM all 6.51 0.00 93.47 0.02
07:20:00 AM all 6.17 0.00 93.79 0.04
07:30:00 AM all 6.89 0.00 93.01 0.10
07:40:00 AM all 6.01 0.00 93.97 0.02
07:50:00 AM all 7.11 0.00 78.55 14.33
08:00:00 AM all 6.96 0.00 90.58 2.46
08:10:00 AM all 11.08 0.00 65.54 23.38
08:20:01 AM all 9.40 0.00 89.01 1.59
08:30:00 AM all 7.42 0.00 64.55 28.03
08:40:00 AM all 9.70 0.00 82.07 8.23
08:50:00 AM all 8.80 0.00 72.56 18.64
09:00:01 AM all 9.81 0.00 84.12 6.07
09:10:00 AM all 11.11 0.00 71.99 16.90
09:20:00 AM all 10.51 0.00 82.38 7.11
We can't see this behaviour on the old server.
Is there a known problem with Oracle and this kernel, or are there
special kernel parameters we can apply?
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. start oracle
Actual Results: high load, can't use the system
Expected Results: normal behaviour of the Oracle DB
Christoph, I can't reproduce this problem internally because I don't
have that hardware. Can you profile the kernel for me?
1. Enable kernel profiling by turning on nmi_watchdog and allocating
the kernel profile buffer.
For example, add the following two items to the "kernel" line of
your boot loader configuration (/boot/grub/grub.conf on RHEL3),
as in the following example:
kernel /vmlinuz-2.4.21-15.0.4.ELsmp ro profile=2 nmi_watchdog=1
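For reference, a complete grub.conf stanza with those two parameters might look like the following sketch; the root device and initrd name here are placeholders, adjust them for your system:

```
title Red Hat Enterprise Linux (2.4.21-15.0.4.ELsmp, profiling)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-15.0.4.ELsmp ro root=/dev/sda2 profile=2 nmi_watchdog=1
        initrd /initrd-2.4.21-15.0.4.ELsmp.img
```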
2. Create a shell script containing the following lines (note: "+2"
is the obsolete sort field syntax, equivalent to the modern "-k3,3nr"):
while /bin/true; do
    /usr/sbin/readprofile -v | sort -nr +2 | head -15
    sleep 10
done
3. I also need you to get me several AltSysrq-W outputs when the
system is in this state. This will allow me to see what every CPU is
doing when it's in system mode.
Thanks for your help, Larry Woodman
Christoph, any more data on this bug? I suspect that the problem is
kswapd spending a bunch of time in prune_icache(), but I can't tell
without the profiles and/or AltSysrq stats.
Anyway, the latest RHEL3-U4 kernel is located here, please try it:
I'm sorry, but I think I can't provide the information you need.
Our Oracle DB server is a live system. It works in a big PACS system,
and we have about 2500+ clients (in our hospital) who are using the
X-ray archive. So I can't shut it down, restart it and experiment,
as you can imagine. This is a 24/7 system. I'm also sorry to say we
don't have a test system where we can try out the whole behaviour of
the 8-way system. My next try will be on Saturday, when I have a
window in which I can mount the DB on the new server once again.
What I did was set up the 8-way machine the same way
(kernel, qla driver, RPM packages, ...) as our old 4-way IBM x445,
so normally this should work.
Is the behaviour of the kernel very different on an 8-way Xeon MP
compared to a 4-way Xeon DP? Could the problem be hardware related?
It's also clear that without information from my side you can't
solve this problem. But when the DB is not working we will set up a
test system, because we have to use the new server, as the old one
is getting too slow. But it's also not very easy to simulate our
live workload.
Thanks in advance
Christoph, I'm not sure what problem you are running into here.
Perhaps increasing the number of CPUs from 4 to 8 is causing an
excessive spinlock contention problem. BTW, you can gather AltSysrq
stats without rebooting or disturbing the system:
1.) echo 1 > /proc/sys/kernel/sysrq
2.) echo w > /proc/sysrq-trigger
3.) the active tracebacks for all CPUs will be available via dmesg
We really can't debug this problem without this minimum data.
OK, I will migrate the DB on Saturday to the new 8-way Xeon MP
machine (IBM x445 Type 8870-42x). The only problem I have is that if
it's not a problem of the kernel (2.4.21-15.03) or U3, I have to
migrate back to the old server as soon as possible, so that the
system will be available again.
But I think I can try for 15 minutes or so to get some data for you.
BTW, I'm using HT, so Linux sees a total of 16 CPUs. Is this too
much? Should I turn it off?
is an 8-way x445 a NUMA system, by chance ?
If so, the reason for slower performance is that half the memory is on
the other NUMA node, and slower by a large factor (like 8x slower,
IIRC). This means that for an application like Oracle, where lots of
the data is shared, half of the memory accesses will be 8x slower than
before, for an average access speed of 4.5x slower than without NUMA.
Increasing the number of CPUs doesn't help if the workload is not CPU
bound, but memory bound.
Of course, I do not know if the x445 works like this, I'm just
guessing based on how some of IBM's older x4xx series worked...
As far as I know, the 8-way x445 is not a NUMA system.
It is based on IBM's X-Architecture, and it is certified for RHEL3.
From the IBM website:
#The high-performance solution for ERP, CRM, database, and server
consolidation for your enterprise datacenter
#Support for up to 8-way scalability with Red Hat Enterprise Linux
So, once again, to me this seems to be a kernel problem.
For what it's worth, most of that is IBM hype. When we tested the
x445, it worked fairly well as a 'consolidation' box, i.e. 2
processes that want to run, each consuming 4 CPUs. However, a true
SMP process had identical performance on an IBM 4-way and an IBM 8-
way (with the Summit/NUMA changes). If you press IBM on this they'll
generally admit that. Having said that, most Oracle processes do
scale at least partially on the x445 (we saw about 40-60%
scalability over a 4-way), as it can support multiple reader/writer
threads which will balance themselves across the quad CPU groups.
However, on a happy server, all the time shouldn't be spent in
system. You could easily be hitting one of the U2 or U3 SCSI/VM bugs.
Most of these were resolved by the -25 (U4 beta series) kernels in
our testing.
Just to clarify the state of this bug, Larry is waiting on data to
be provided by Ing. Christoph Pirchl (to be gathered this weekend),
so I'm putting this bug back into NEEDINFO state.
Yes, I will try to get some information on Saturday.
Also, I'm still waiting for a recommendation on Hyper-Threading:
should I turn it off, or doesn't it matter whether I use 8
processors or 8 with HT enabled?
Thanks in advance
I migrated the server from the old 4-way x445 to the new 8-way x445
(the same RPMs, kernel and drivers installed).
We see the same behaviour of the server as last time:
a lot of running processes (> 35) and a lot of system load
(> 90%) (see vmstat and sar in the attachment).
Thanks in advance
Created attachment 108388 [details]
Sysinfos from the x445-8 way Xeon MP server
sysrq, vmstat and sar infos
I sent you the information on Saturday, 11.12. Have you looked at the
sysrq, sar and vmstat logs? I haven't heard anything since Saturday.
Do you have any new information for me?
What's going on? I've heard nothing for two weeks.
Christoph, sorry for the delay; most of Red Hat was on vacation over
the holidays.
The problem you are having here is a scalability issue with Oracle
running on a 16-way rather than an 8-way, because each processor is
hyper-threaded. What appears to be happening is that the Oracle
processes are threads rather than separate processes. Every Oracle
thread is in either mmap or munmap, and those routines acquire the
mm->page_table_lock. Since the struct mm is shared between all
threads of a multi-threaded process, only one thread can acquire the
lock while the other 15 (there is one on each of the 16 processors)
spin on it: 15/16 (94%) of the total CPU time is consumed trying to
acquire the mm->page_table_lock, while the other 1/16 (6%) is doing
real work with that lock held.
The first thing you need to do is see if Oracle can be tuned to 1.)
limit the number of threads per process and 2.) limit the number of
Oracle instances that can be doing whatever requires the mapping and
unmapping at the same time. The problem is likely pathological: the
more threads you have running on processors in this state, the more
CPU time will be consumed spinning on this lock.
Can you look into these Oracle parameters?
Dear Mr. Woodman,
I opened a TAR at Oracle Metalink, but have not gotten a reaction
from Oracle so far. I will post the result from Oracle when they
update the TAR.
Have you had any problems like mine with applications on 16-way
machines, for instance with Oracle or other applications?
Christoph, I don't have access to a 16-way. My assessment is based on
looking at the AltSysrq-W output and noticing that one CPU held the
pagetable lock while the other 15 were spinning on it.
Hate to do this to you, but:
Oracle is a process-based model rather than a thread-based model (as it is on
Windows). The exception to this is the SGA. I would say that this smacks of
an Oracle config problem more than Linux. However, someone did see the same
problem on the x445 back in February (https://www.redhat.com/archives/taroon-
list/2004-February/msg00161.html). They ended up increasing their parallelism
within Oracle. I suspect that his remaining perf problems are related to the
max readahead settings that changed between AS2.1 and AS3 (which kill db reads).
Lastly, not sure if you use any of the SGA increasing tricks (shmfs/ramfs/etc),
but try it with a 'normal', i.e. less than 2GB SGA setting to see how it
works. We've had nothing but problems with those and the extreme NUMA nature
of the x445 can only make that worse.
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission-
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
For more information on the RHEL errata support policy, please visit:
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.