Bug 184434

Summary: Load/CPU Usage spikes due to kscand on Oracle nodes
Product: Red Hat Enterprise Linux 3 Reporter: John Blaut <john.blaut>
Component: kernel Assignee: Larry Woodman <lwoodman>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0 CC: petrides
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-10-19 18:46:38 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description John Blaut 2006-03-08 18:51:04 UTC
Description of problem:

Related to Bugzilla Bug 145950 - high loads / high iowait / up 100% cpu time 
for kscand on oracle box:

We have 4 Oracle RAC cluster nodes running RHEL3 Update 6 x86, each having 8GB 
RAM and SGA is currently set to 1.7GB. The problem we are having is that we 
get periodic CPU spikes every 5 minutes where the kscand process takes up to 
100% of the CPU usage and the load shoots up to a value between 50 and 90.

We are currently running Kernel 2.4.21-37.ELsmp which according to advisory 
RHSA-2005-663 addresses the problem we are having by supporting:

- support for new "oom-kill" and "kscand_work_percent" sysctls  

We would like to know exactly how best to set the 
tunable "/proc/sys/vm/kscand_work_percent" to solve/alleviate the problem we 
presently have. We are running Oracle without hugepages. Currently it is set 
to the default setting: 100. Please provide recommendations for this value and 
instructions for how to set it.
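
For reference, the mechanics of reading and changing the tunable can be sketched as follows. This is a minimal illustration, not a recommendation: it assumes a Python 3 interpreter is available (RHEL 3 itself does not ship one), the helper names `read_tunable` and `write_tunable` are hypothetical, the 0-100 range check is an assumption based on the tunable being named a percentage, and the value 50 is arbitrary. The demo runs against a scratch file; on a real node the default path would be used and the write would need root.

```python
import tempfile
from pathlib import Path

# Path taken from the report; it exists only on kernels that provide this tunable.
KSCAND_WORK_PERCENT = Path("/proc/sys/vm/kscand_work_percent")


def read_tunable(path=KSCAND_WORK_PERCENT):
    """Return the current integer value stored in the tunable file."""
    return int(Path(path).read_text().strip())


def write_tunable(value, path=KSCAND_WORK_PERCENT):
    """Write a new value; needs root when path is under /proc/sys."""
    # Assumption: the tunable is a percentage, so reject values outside 0-100.
    if not 0 <= value <= 100:
        raise ValueError(f"expected a percentage, got {value}")
    Path(path).write_text(f"{value}\n")


if __name__ == "__main__":
    # Demonstrate on a scratch file so the sketch runs on any machine.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "kscand_work_percent"
        demo.write_text("100\n")   # default value reported in this bug
        write_tunable(50, demo)    # 50 is an arbitrary illustration, not advice
        print(read_tunable(demo))
```

A value written this way would not be expected to survive a reboot, since /proc/sys entries reflect live kernel state.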

Also please indicate if there is anything new/other than the above that can be 
done to address this problem.


Version-Release number of selected component (if applicable):
2.4.21-37.ELsmp 

How reproducible:
It's ongoing. Any recommendation for how to possibly reproduce/further debug 
the problem?
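
One possible way to gather data on the spikes is to sample /proc/loadavg at intervals and flag the readings that cross a threshold. The sketch below only shows the parsing step and is an assumption-laden illustration: `parse_loadavg` and `is_spike` are hypothetical helper names, the threshold of 50 simply mirrors the low end of the range quoted above, the sample line is made up rather than captured from the affected nodes, and a Python 3 interpreter is assumed.

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return one, five, fifteen


def is_spike(text, threshold=50.0):
    """True when the 1-minute load average is at or above the threshold."""
    # Assumption: 50 mirrors the low end of the 50-90 range in the report.
    return parse_loadavg(text)[0] >= threshold


if __name__ == "__main__":
    sample = "63.12 41.07 22.95 5/412 10234"  # made-up sample line, not real data
    print(parse_loadavg(sample), is_spike(sample))
```

Logging a timestamp next to each flagged sample would show whether the spikes really recur on a 5-minute cycle.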

Steps to Reproduce:
When the servers were idle we did not notice this problem. However, when we 
moved the servers into production and started getting load, the problem became 
apparent right away.

Comment 1 Larry Woodman 2006-07-28 19:25:49 UTC
Please set /proc/sys/vm/kscand_work_percent to 100 and leave the oom-kill
tunable with the default value.  Please let me know if this solves your problem.

Thanks, Larry Woodman


Comment 3 RHEL Program Management 2007-10-19 18:46:38 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.