Bug 141476

Summary: oracle DB creates very high system load
Product: Red Hat Enterprise Linux 3
Reporter: Ing. Christoph Pirchl <christoph.pirchl>
Component: kernel
Assignee: Larry Woodman <lwoodman>
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Version: 3.0
CC: petrides, riel, steve.russell
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-10-19 19:12:32 UTC
Attachments:
Sysinfos from the x445-8 way Xeon MP server (flags: none)

Description Ing. Christoph Pirchl 2004-12-01 13:51:58 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Opera 7.54 [de]

Description of problem:
After moving an Oracle DB from a 4-way IBM x445 (kernel 2.4.21-15, U2)
to an 8-way IBM x445 (kernel 2.4.21-15.0.4.ELsmp, U3), we see
strange behaviour on the new server.

After starting up the Oracle DB and our application
we get a very high load, and about 90% of the load is system load.

Time              CPU     %user     %nice   %system     %idle
07:00:00 AM       all      5.60      0.00     94.35      0.05
07:10:00 AM       all      6.51      0.00     93.47      0.02
07:20:00 AM       all      6.17      0.00     93.79      0.04
07:30:00 AM       all      6.89      0.00     93.01      0.10
07:40:00 AM       all      6.01      0.00     93.97      0.02
07:50:00 AM       all      7.11      0.00     78.55     14.33
08:00:00 AM       all      6.96      0.00     90.58      2.46
08:10:00 AM       all     11.08      0.00     65.54     23.38
08:20:01 AM       all      9.40      0.00     89.01      1.59
08:30:00 AM       all      7.42      0.00     64.55     28.03
08:40:00 AM       all      9.70      0.00     82.07      8.23
08:50:00 AM       all      8.80      0.00     72.56     18.64
09:00:01 AM       all      9.81      0.00     84.12      6.07
09:10:00 AM       all     11.11      0.00     71.99     16.90
09:20:00 AM       all     10.51      0.00     82.38      7.11
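
A table like the one above can be summarised by averaging the %system column. The following is only a minimal sketch: it assumes the column layout shown here (time, AM/PM, CPU, %user, %nice, %system, %idle), and the function name avg_system is made up for illustration.

```shell
#!/bin/sh
# avg_system (hypothetical helper): read `sar`-style CPU lines on stdin and
# print the mean of the %system column. Assumes the layout from the report:
#   HH:MM:SS AM|PM  all  %user  %nice  %system  %idle
avg_system() {
    LC_ALL=C awk '$3 == "all" && $6 ~ /^[0-9.]+$/ { sum += $6; n++ }
                  END { if (n) printf "%.2f\n", sum / n }'
}

# Example input: the first three samples from the report.
avg_system <<'EOF'
07:00:00 AM       all      5.60      0.00     94.35      0.05
07:10:00 AM       all      6.51      0.00     93.47      0.02
07:20:00 AM       all      6.17      0.00     93.79      0.04
EOF
```

Header lines and any line whose third field is not "all" are skipped, so the whole sar output can be piped in unchanged.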

We don't see this behaviour on the old server.

Is there a known problem with Oracle and this kernel? Are there
special kernel parameters we can apply?

Version-Release number of selected component (if applicable):
kernel-2.4.21-15.0.4.ELsmp

How reproducible:
Always

Steps to Reproduce:
1. Start Oracle

Actual Results:  high load, the system is unusable

Expected Results:  normal behaviour of the oracle DB

Additional info:

Comment 1 Larry Woodman 2004-12-03 12:29:53 UTC
Christoph, I can't reproduce this problem internally because I don't
have that hardware.  Can you profile the kernel for me?

1. Enable kernel profiling by turning on nmi_watchdog and allocating
the kernel profile buffer.
   For example, add the following two items to the "kernel" line of
/boot/grub/grub.conf:

     profile=2 nmi_watchdog=1

   as in the following example:

     kernel /vmlinuz-2.4.21-15.0.4.ELsmp ro profile=2 nmi_watchdog=1 root=0805

   Then reboot.

2. Create a shell script containing the following lines:

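#!/bin/sh
# Every 5 seconds, print the 15 kernel functions with the most profile
# ticks, then reset the counters so each sample covers one interval.
while /bin/true; do
    echo; date
    # Sort on the third column (tick count), highest first.
    /usr/sbin/readprofile -v | sort -nr -k3 | head -15
    /usr/sbin/readprofile -r    # reset the profile buffer
    sleep 5
done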


3. I also need you to get me several AltSysrq-W outputs when the
system is in this state.  This will allow me to see what every CPU is
doing when it's in system mode.

Thanks for your help, Larry Woodman 

Comment 2 Larry Woodman 2004-12-07 15:18:52 UTC
Christoph, any more data on this bug?  I suspect that the problem is
kswapd spending a bunch of time in prune_icache() but I can't tell
without the profiles and/or AltSysrq stats. 

Anyway, the latest RHEL3-U4 kernel is located here, please try it:

ftp://partners.redhat.com/a61d109e2483b0bf579b0b5f90a5ea8c/2.4.21-27.EL/


Larry Woodman

Comment 3 Ing. Christoph Pirchl 2004-12-07 16:43:54 UTC
Dear Larry,

I'm sorry, but I don't think I can provide the information you need.
Our Oracle DB server is a live system. It is part of a big PACS system,
and we have 2500+ clients (in our hospital) who use the
X-ray archive. So I can't shut down, restart and experiment a little,
as you can imagine. This is a 24/7 system. Unfortunately, we also
don't have a test system where we can try out the whole behaviour of the
8-way system. My next attempt will be on Saturday, when I have a window
in which I can mount the DB on the new server once again.

What I did was set up the 8-way machine the same way
(kernel, qla driver, rpm packages, ...) as our old 4-way IBM x445.
So normally this should work.

Is the behaviour of the kernel very different between a 4-way Xeon DP
and an 8-way Xeon MP? Could this be a hardware-related problem?

It's also clear that, without information from my side, you can't
solve this problem. If the DB does not work, we will set up a
test system, because we have to use the new server, as the old one is
getting too slow. But it's not very easy to simulate our live
system.

Thanks in advance

CHristoPh

Comment 4 Larry Woodman 2004-12-07 16:56:43 UTC
Christoph, I'm not sure what problem you are running into here. 
Perhaps increasing the number of CPUs from 4 to 8 is causing an
excessive spinlock contention problem.  BTW, you can gather AltSysrq
stats without rebooting or disturbing the system: 

1.) echo 1 > /proc/sys/kernel/sysrq
2.) echo w > /proc/sysrq-trigger
3.) the active tracebacks for all cpus will be available via dmesg
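
The three steps above can be wrapped in a small script that takes several samples a few seconds apart. This is a sketch only, to be run as root: the function name capture_sysrq_w and its optional third and fourth arguments are hypothetical and exist only so the function can be dry-run against ordinary files instead of /proc.

```shell
#!/bin/sh
# capture_sysrq_w COUNT INTERVAL [TRIGGER] [ENABLE]   (hypothetical helper)
# Requests COUNT SysRq-W dumps, INTERVAL seconds apart. TRIGGER and ENABLE
# default to the /proc paths from the steps above.
capture_sysrq_w() {
    count=$1
    interval=$2
    trigger=${3:-/proc/sysrq-trigger}
    enable=${4:-/proc/sys/kernel/sysrq}

    echo 1 > "$enable" || return 1        # step 1: enable SysRq
    i=0
    while [ "$i" -lt "$count" ]; do
        echo w > "$trigger" || return 1   # step 2: request the dump
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Typical use (step 3 saves the kernel log afterwards):
# capture_sysrq_w 3 5 && dmesg > sysrq-w.txt
```

Taking more than one dump matters here because a single traceback only shows where each CPU was at one instant.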

We really can't debug this problem without this minimum data.


Larry Woodman




Comment 5 Ing. Christoph Pirchl 2004-12-07 20:15:05 UTC
OK, I will migrate the DB on Saturday to the new 8-way Xeon MP
machine (IBM x445 Type 8870-42x). The only problem I have is that, if it's
not a problem of the kernel (2.4.21-15.03) or U3, I have to migrate back
to the old server as soon as possible, so that the system is
running again.

But I think I can try for 15 minutes or so to get some data for you.

BTW, I'm using HT, so Linux sees a total of 16 CPUs. Is this too much?
Should I turn it off?

Thanks

CHristoPh Pirchl

Comment 6 Rik van Riel 2004-12-08 02:22:43 UTC
Christoph,

Is an 8-way x445 a NUMA system, by chance?

If so, the reason for slower performance is that half the memory is on
the other NUMA node, and slower by a large factor (like 8x slower,
IIRC). This means that for an application like Oracle, where lots of
the data is shared, half of the memory accesses will be 8x slower than
before, for an average access speed of 4.5x slower than without NUMA.
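
The 4.5x figure is the mean of a local access (1x) and a remote access (8x) when half of the accesses go to each node. A small sketch of that arithmetic, with the 8x penalty and the 50% remote share taken from the guess above rather than from measurements; avg_slowdown is a hypothetical helper name.

```shell
#!/bin/sh
# avg_slowdown REMOTE_FRACTION REMOTE_PENALTY   (hypothetical helper)
# Average memory access cost relative to an all-local system, where local
# accesses cost 1 and remote accesses cost REMOTE_PENALTY.
avg_slowdown() {
    LC_ALL=C awk -v f="$1" -v p="$2" \
        'BEGIN { printf "%.2f\n", (1 - f) * 1 + f * p }'
}

avg_slowdown 0.5 8    # half of the accesses remote, 8x penalty: prints 4.50
```

A smaller remote share lowers the figure quickly, for example `avg_slowdown 0.25 8` gives 2.75.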

Increasing the number of CPUs doesn't help if the workload is not CPU
bound, but memory bound.

Of course, I do not know if the x445 works like this, I'm just
guessing based on how some of IBM's older x4xx series worked...

Comment 7 Ing. Christoph Pirchl 2004-12-08 22:21:37 UTC
Rik,

As far as I know, the 8-way x445 is not a NUMA system.

It is based on the X-Architecture, and it is certified for RHEL3.

From the IBM website:

#The high-performance solution for ERP, CRM, database, and server 
consolidation for your enterprise datacenter
#Support for up to 8-way scalability with Red Hat Enterprise Linux

So I think, once again, that this looks like a kernel problem to me.


Comment 8 Steve Russell 2004-12-09 14:10:46 UTC
For what it's worth, most of that is IBM hype.  When we tested the 
x445, it worked fairly well as a 'consolidation' box, i.e. 2 
processes that want to run, each consuming 4 CPUs.  However, a true 
SMP process had identical performance on an IBM 4-way and an IBM 
8-way (with the Summit/NUMA changes).  If you press IBM on this they'll 
generally admit that.  Having said that, most Oracle processes do 
scale at least partially on the x445 (we saw about 40-60% 
scalability over a 4-way) as it can support multiple reader/writer 
threads which will balance themselves across the quad cpu groups.  

However, in a happy server, all the time shouldn't be spent in 
system.  Could easily be tweaking one of the U2 or U3 SCSI/VM bugs.  
Most of these things were resolved by the -25 (U4 beta series) in our 
testing.

Comment 9 Ernie Petrides 2004-12-09 21:02:24 UTC
Just to clarify the state of this bug, Larry is waiting on data to
be provided by Ing. Christoph Pirchl (to be gathered this weekend),
so I'm putting this bug back into NEEDINFO state.

Comment 10 Ing. Christoph Pirchl 2004-12-09 21:14:12 UTC
Yes, I will try to get some information on Saturday.
I'm also still waiting for a recommendation on Hyper-Threading.

Should I turn it off, or does it not matter whether I use 8 processors
or 8 processors with HT enabled?

Thanks in advance

CHristoPh


Comment 11 Ing. Christoph Pirchl 2004-12-11 20:39:08 UTC
I migrated the server from the old 4-way x445 to the new 8-way x445 
(the same rpms, kernel and drivers installed).

The server shows the same behaviour as last time.

I can see a lot of running processes (> 35) and a lot of system load
(> 90%); see the vmstat and sar output in the attachment.

thanks in advance

CHristoPh

Comment 12 Ing. Christoph Pirchl 2004-12-11 20:41:04 UTC
Created attachment 108388
Sysinfos from the x445-8 way Xeon MP server

sysrq, vmstat and sar infos

Comment 13 Ing. Christoph Pirchl 2004-12-15 17:43:15 UTC
Dear Sirs,

I sent you the information on Saturday, 11.12. Have you looked at the 
sysrq, sar and vmstat logs? I haven't heard anything since Saturday.

Do you have any new information for me?

thanks

CHristoPh Pirchl

Comment 14 Ing. Christoph Pirchl 2004-12-29 15:22:47 UTC
Dear Sirs,

What's going on? I have heard nothing for 2 weeks.

thanks

CHristoPh Pirchl

Comment 15 Larry Woodman 2005-01-06 18:29:35 UTC
Christoph, sorry for the delay; most of Red Hat was on vacation over the
holidays.

The problem you are having here is a scalability issue with Oracle
running on a 16-way rather than an 8-way, because each processor is
dual hyper-threaded.  What appears to be happening is that the Oracle
processes are threads rather than different processes.  Every Oracle
thread is in either mmap or munmap, and those routines acquire the
mm->page_table_lock.  Since the mm_struct is shared between all
threads of a multi-threaded process, only one thread can hold the lock
while the other 15 threads (there is one on each of the 16 processors)
spin waiting for it.  As a result, 15/16 (94%) of the total cpu time is
consumed trying to acquire that mm->page_table_lock, while the other
1/16 (6%) of the total cpu time is doing real work with that lock held.
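
The split can be worked out for any CPU count. The sketch below assumes the simplified model described above (exactly one CPU holds the lock and every other CPU spins on it); spin_share is a hypothetical helper name.

```shell
#!/bin/sh
# spin_share NCPUS   (hypothetical helper)
# With one lock holder and NCPUS-1 spinners, print the share of total CPU
# time spent spinning and the share spent doing work under the lock.
spin_share() {
    LC_ALL=C awk -v n="$1" 'BEGIN {
        printf "spinning %.2f%% working %.2f%%\n", (n - 1) / n * 100, 1 / n * 100
    }'
}

spin_share 16    # prints: spinning 93.75% working 6.25%
```

Under the same model, turning Hyper-Threading off (8 logical CPUs) would still leave 87.50% of the time spinning, so the model alone does not predict that fewer CPUs fixes the problem.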

The first thing you need to do is see if Oracle can be tuned to 1.)
limit the number of threads per process and 2.) limit the number of
Oracle instances that can be doing whatever requires the mapping and
unmapping at the same time.  The problem is likely
pathological; the more threads you have running on processors in this
state, the more cpu time will be consumed spinning on this lock.

Can you look into these Oracle parameters?

Larry Woodman

Comment 16 Ing. Christoph Pirchl 2005-01-20 20:58:48 UTC
Dear Mr. Woodman,

I opened a TAR at Oracle Metalink, but have not had a reaction from 
Oracle so far. I will post the result from Oracle when they update 
the TAR.

Have you had any problems like mine with applications on 16-way machines,
for instance with Oracle or other applications?

Thanks

CHristoPh

Comment 17 Larry Woodman 2005-01-20 21:38:40 UTC
CHristoPh, I don't have access to a 16-way.  My assessment is based on
looking at the AltSysrq-W output and noticing that one cpu had the
pagetable lock and all other 15 were spinning on that lock.

Larry Woodman


Comment 18 Steve Russell 2005-03-06 18:50:04 UTC
Hate to do this to you, but:
http://www.redhat.com/whitepapers/rhel/OracleonLinux.pdf

Oracle is a process based model rather than a thread based model (as it is on 
Windows).  The exception to this is the SGA.  I would say that this smacks of 
an Oracle config problem more than Linux.  However, someone did see the same 
problem on the x445 back in Feb 
(https://www.redhat.com/archives/taroon-list/2004-February/msg00161.html).  
They ended up increasing their parallelism within Oracle.  I suspect that his 
remaining perf problems are related to the max readahead settings that changed 
between AS2.1 and AS3 (which kill db reads).  

Lastly, not sure if you use any of the SGA increasing tricks (shmfs/ramfs/etc), 
but try it with a 'normal', i.e. less than 2GB SGA setting to see how it 
works.  We've had nothing but problems with those and the extreme NUMA nature 
of the x445 can only make that worse.  

Comment 19 RHEL Program Management 2007-10-19 19:12:32 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet that criteria, it is now being closed.
 
For more information of the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.