Bug 80279

Summary: ksoftirqd_CPU0 hits 100% when running iostat
Product: Red Hat Enterprise Linux 2.1 Reporter: Anthony Marusic <amarusic>
Component: kernelAssignee: Don Howard <dhoward>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1CC: jeffrey.buchsbaum, nphilipp
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-07-26 18:12:19 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Anthony Marusic 2002-12-23 22:08:21 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020918

Description of problem:


While running 9i RAC tests, we were monitoring overall performance, using the
top utility.  All 4 CPUs were evenly distributing the workload. This was evident
by the percentage of  user CPU time of all CPUs ranging between 85% and 100%. 
When we ran "iostat -x 3" to check disk IO performance, the process,
ksoftiqrd_CPU0 ran up to 100% system CPU time, and stood at 100% throughout the
rest of the test.  At this time, CPU1,2,and 3 went to 1-3% user and 1-3% system
CPU times.  The ksoftirqd_CPU0 process continued to exhibit the same results
when starting a second test.  This condition was only cleared, when restarting
the database.  In addition to this, many of the the counters for "iostat -x 3 "
(%util, avgqu-sz, avgrq-sz, svctm, etc.) seemed to display cumulative results,
not being able to clear themselves and give proper 3 second statistics.

Background Information: 
2-way, 2.8Ghz server with Hyper Threading turned on 
      * kernel -- 2.4.9-e.8 enterprise.AS2.1 1686
      * sysstat 4.0.1 Release 2 
      * Oracle testing: 
      * Mixed read/write IO on 4 tablespaces, across 4 CPUs 
                Running 60 oracle processes 
                One Oracle Instance 

The 2 major problems are:

1) CPU0 ends up running at 100% system time, severely impacting performance

2) iostat does not seem to clean it's counters for the specified interval, or
ever for that matter.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1.Start 60 oracle processes, issuing continuous reads/writes to 4 tablespaces.
Oracle tests run multiple select & insert/update statements.
2.run iostat during tests
3.top will show CPU0 at 100% system util while CPU1 CPU2 & CPU3 are at .03%
    

Additional info:

kernel -- 2.4.9-e.8 enterprise.AS2.1 1686
sysstat 4.0.1 Release 2

Comment 1 jeffrey.buchsbaum 2003-02-13 17:32:31 UTC
This is the same bug really as 83789.  I have the same problem with just one
telnet session running.

Dell 530 dual 2.4ghz
4gb ram ecc

Comment 2 Alan Cox 2003-06-05 14:51:14 UTC
Im unconvinced they are the same thing


Comment 3 edward dertouzas 2003-07-27 05:32:21 UTC
    6 root      34  19     0    0     0 SWN   0.0  0.0 319:40 ksoftirqd_CPU0
   10 root      15   0     0    0     0 SW    0.0  0.0  95:25 kswapd
   13 root      15   0     0    0     0 SW    0.0  0.0 168:12 bdflush

Linux 2.4.9-e.12enterprise #1 SMP Tue Feb 11 01:29:18 EST 2003 i686 unn

This happens in a production environment running Oracle. Every few days the 
system will become completely unresponsive except for a redimentary functioning 
of the TCP/IP stack. Connect() returns success but remote host will just idle 
from that point forward. Server is unresponsive on console until it either 
returns (anywhere between 5 - 45 minutes, usually after the oracle listener and 
db have died) or the host is manually powercycled.

Note: this is not connected with any orinico problems.

Comment 4 Nils Philippsen 2004-01-13 15:38:13 UTC
Does the problem still show with recent kernels?

Anyway, this sounds like a kernel/scheduling problem to me, even more
so because the problem shows not only when iostat runs.

Comment 5 Charlie Bennett 2004-09-23 19:29:43 UTC
handing off to the kernel group

Comment 6 Jason Baron 2004-09-23 19:46:23 UTC
this is an old one, i think we should start w/reproducing it on the
latest rhel2.1 kernel, e.49. thanks.

Comment 7 Don Howard 2006-03-16 21:57:13 UTC
This is a truely ancient report.  If there is no update here in the next two
weeks demonstrating this bug on a current 2.1 kernel, this ticket will be closed.