From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MyIE2; .NET CLR 1.1.4322)
Description of problem:
I have the same problem as in bug 100680. It occurred after installing a new patch on Oracle 9.2 (from 9.2.0.4 to 9.2.0.5). I did not have this problem before installing the patch, and I made no other changes. The computer has two P3 Xeon CPUs, 2 GB of RAM and 3 SCSI HDDs. I have set the severity to "high" because this is my production system.
Version-Release number of selected component (if applicable):
kernel-2.4.21-4.ELsmp
How reproducible:
Didn't try
Steps to Reproduce:
Install patch 9.2.0.4 -> 9.2.0.5 on an existing Oracle installation
Additional info:
Exactly how much CPU is kscand using? Does it use that much CPU all the time, or does the CPU use come in load spikes? Could you send us 30 seconds of output from 'vmstat 1' taken while the problem is occurring?
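For anyone collecting the data requested here: the output can be saved with `vmstat 1 30 > vmstat.txt`, and the following minimal Python sketch picks out the samples where system CPU time spikes. It assumes the vmstat output has a column-name row containing `sy` and `id`; the sample text and the 50% threshold are made up for illustration and are not taken from the attachments in this report.

```python
def parse_vmstat(text):
    """Parse `vmstat 1` output into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name row is the one containing the tokens "sy" and "id".
        if "sy" in fields and "id" in fields:
            header = fields
            continue
        # Data rows are all-numeric and as wide as the column-name row.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def sys_spikes(rows, threshold=50):
    """Return (sample_index, sy) for samples whose system CPU % is >= threshold."""
    return [(i, r["sy"]) for i, r in enumerate(rows) if r["sy"] >= threshold]


# Hypothetical sample for illustration only.
SAMPLE = """\
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0      0  15000  20000 1500000    0    0   120   300  400   900 10  5 80  5
 9  1      0  14000  20000 1500000    0    0     4    10  350   200  1 97  1  1
"""

if __name__ == "__main__":
    print(sys_spikes(parse_vmstat(SAMPLE)))
```

Looking up columns by name rather than by position keeps the sketch independent of the exact column layout, which may differ between procps versions.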
A bug was found and fixed recently in the page_referenced() function that inadvertently caused pagecache pages to be aged upward when they shouldn't have been. This bug, combined with the relatively large pagecache of an Oracle database, can certainly cause kscand to run more than it should. We could get you a test kernel with this bug fix to determine whether this is the cause of your problem. However, can you tell us what the Oracle 9.2.0.5 patch contained, or should we contact Oracle about that? Larry Woodman
Created attachment 99992 [details] vmstat output
kscand uses between 3% and 90% on both CPUs. At times it uses more than 90% of a CPU! It occurs only when Oracle is running.
Where can I get this test kernel? I have downloaded the file ftp://ftp.redhat.com/pub/redhat/linux/updates/enterprise/3AS/en/os/SRPMS/kernel-2.4.21-9.0.3.EL.src.rpm and I am going to try this kernel.
Created attachment 99993 [details] Oracle9i Database Server Patch Set Notes Oracle9i Database Server Patch Set Notes Release 2 Patch Set 4 Version 9.2.0.5.0 for Linux x86
Can you also include a quick "top" output at the same time so we can see what else is running? Larry
Created attachment 101247 [details] Output of vmstat after setting vm parameters This is the output of vmstat, showing several occurrences of the jump in load caused by the kscand daemon. It was taken after applying the values given by the Red Hat Support team. I am afraid those changes didn't help, as the system continues to misbehave. I am willing to provide data to help solve this issue. Thank you. Regards, Antonio
Our system is a Dell 6650 with 4x 2.5 GHz Xeon, 8 GB of memory, and a PERC4/DC with two 36 GB 15k RPM drives. QLogic Fibre Channel, Hitachi storage. Regards, AC
Created attachment 101248 [details] Top q b output This top output shows the problem area, where system CPU reaches 100%, I/O almost stops and kscand is the top CPU consumer. The load jumps from the normal 4-5 to 30-50 in one step, and about a minute later the load is 4-5 again. Regards, AC
Test, to add my address to the Cc list. Sorry. AC
Hello all, is this bug still active? We are experiencing this jumpy load behaviour with 2.4.21-15EL SMP here. I will supply any data needed to help debugging. Thanks, Antonio
I need to know if this is still a problem with the latest RHEL3-U4 kernel. We fixed a couple of problems in kscand and need to know whether they fixed this problem so that this bug can be closed. Larry Woodman
Hello, I would like to know whether the fix is available only in kernel 2.6.xxx (RHEL-U4), or whether there is also a fix for 2.4.xxx kernels (RHEL-U2 or RHEL-U3). Thanks, Eduardo Dias
Hi there, is anyone using Dell computers with the new kernel? I have two PowerEdge 6650s here and had to give up on using clumanager because of the outages created by the softdog when the kscand daemon froze the machines, which caused unwanted switching between cluster nodes. Still no chance to test the new kernel... Thank you, Antonio
We are also experiencing this problem when using the Progress database: kscand and kswapd use ALL the CPU, freezing the machine.
This bug should be fixed from RHEL3 U4 onwards. If kscand is taking too much CPU, you can reduce the kscand scan percentage (in /proc/sys/vm) to something like 10%.
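As an illustration of the tuning suggested here, a minimal Python sketch that lowers a tunable under /proc/sys/vm and reports its previous value. The exact file name is not given in this comment; `kscand_work_percent` is taken from the RHEL3 U6 errata mentioned later in this thread, and treating it as the tunable meant here is an assumption. The helper name `set_vm_tunable` is hypothetical.

```python
import os


def set_vm_tunable(name, value, base="/proc/sys/vm"):
    """Write value to base/name; return the previous contents, or None if absent."""
    path = os.path.join(base, name)
    if not os.path.exists(path):
        # This kernel does not expose the tunable; nothing is changed.
        return None
    with open(path) as f:
        old = f.read().strip()
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return old


# Example call (needs root; assumes the tunable name discussed above):
#   previous = set_vm_tunable("kscand_work_percent", 10)
#   print("previous value:", previous)
```

Returning the old value makes it easy to restore the setting if the change does not help. A value written this way does not survive a reboot.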
Hi there. We are using the following values. With these, kscand's behaviour is tamed: it only wakes up from time to time and takes the processors for a few seconds (1 to 3 s) at a time. We have an Oracle database with two instances, 900 MB SGA, and 1000-1200 processes/users.
Last login: Thu Nov 10 08:27:39 2005 from gsisssr_alfcruz.hcpa
[root@vega root]# cat /proc/sys/vm/
bdflush                 max_map_count      pagecache
dcache_priority         max-readahead      page-cluster
hugetlb_pool            min-readahead      pagetable_cache
inactive_clean_percent  overcommit_memory  stack_defer_threshold
kswapd                  overcommit_ratio
[root@vega root]# cat /proc/sys/vm/*
30 500 0 0 500 3000 70 50 0 0 900 30 256 32 8 65536 16 3 0 50 10 75 30 3 25 50 2048
[root@vega root]#
We are going to try the U4 release when possible. Thank you for the feedback. Regards, Antonio
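Because `cat /proc/sys/vm/*` prints the values without file names, the output above is hard to map back to individual tunables. A minimal Python sketch that prints each value next to its name could look like this; the function names are hypothetical, and the set of files depends on the kernel in use.

```python
import os


def read_vm_tunables(base="/proc/sys/vm"):
    """Return {file name: contents} for every readable regular file under base."""
    values = {}
    for name in sorted(os.listdir(base)):
        path = os.path.join(base, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                # Collapse tabs and newlines so each tunable fits on one line.
                values[name] = " ".join(f.read().split())
        except OSError:
            # Some entries may be write-only or restricted; skip them.
            continue
    return values


def format_tunables(values):
    """Render the mapping as one 'name = value' line per tunable."""
    return "\n".join(f"{name} = {value}" for name, value in values.items())


if __name__ == "__main__":
    if os.path.isdir("/proc/sys/vm"):
        print(format_tunables(read_vm_tunables()))
```

Posting the output in this labelled form makes it easier to compare settings between machines.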
(In reply to comment #17) Unfortunately, we're already using 3.4-2. Could you please expand on your /proc/sys/vm suggestion?
# cat /proc/sys/vm/kswapd
512 32 8
I've just discovered that RHEL3 U6 addresses a similar issue:
http://rhn.redhat.com/errata/RHSA-2005-663.html
support for new "oom-kill" and "kscand_work_percent" sysctls
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=145950
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.