Bug 80279 - ksoftirqd_CPU0 hits 100% when running iostat
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Hardware: i686  OS: Linux
Priority: medium  Severity: medium
Assigned To: Don Howard
Brian Brock
Reported: 2002-12-23 17:08 EST by Anthony Marusic
Modified: 2007-11-30 17:06 EST (History)
2 users

Doc Type: Bug Fix
Last Closed: 2006-07-26 14:12:19 EDT
Attachments: None
Description Anthony Marusic 2002-12-23 17:08:21 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020918

Description of problem:

While running 9i RAC tests, we monitored overall performance with the
top utility.  All 4 CPUs were sharing the workload evenly, as shown by
per-CPU user time ranging between 85% and 100%.  When we ran "iostat -x 3"
to check disk IO performance, the ksoftirqd_CPU0 process climbed to 100%
system CPU time and stayed there for the rest of the test.  At that point,
CPU1, CPU2, and CPU3 dropped to 1-3% user and 1-3% system CPU time.  The
ksoftirqd_CPU0 process continued to exhibit the same behavior when a
second test was started; the condition cleared only when the database was
restarted.  In addition, many of the counters for "iostat -x 3"
(%util, avgqu-sz, avgrq-sz, svctm, etc.) seemed to display cumulative
results rather than being cleared to give proper 3-second statistics.

Background Information: 
2-way, 2.8Ghz server with Hyper Threading turned on 
      * kernel -- 2.4.9-e.8 enterprise.AS2.1 i686
      * sysstat 4.0.1 Release 2 
      * Oracle testing: 
      * Mixed read/write IO on 4 tablespaces, across 4 CPUs 
                Running 60 oracle processes 
                One Oracle Instance 

The 2 major problems are:

1) CPU0 ends up running at 100% system time, severely impacting performance

2) iostat does not seem to clear its counters for the specified interval,
or ever, for that matter.
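For reference on problem 2: per-interval statistics are normally derived by subtracting successive snapshots of cumulative kernel counters. The script below is a minimal sketch of that delta logic, with simulated counter values rather than real kernel reads, showing the reset-per-interval behavior iostat should exhibit instead of the cumulative output observed here.

```shell
#!/bin/sh
# Minimal sketch of interval-delta accounting: each report line should
# reflect only the activity since the previous snapshot.  The cumulative
# counter samples are simulated; a real tool would read kernel disk stats.
prev=0
for cur in 100 250 310; do
    echo "interval delta: $((cur - prev))"
    prev=$cur
done
# prints: 100, then 150, then 60 -- never the raw cumulative values
```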

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start 60 Oracle processes issuing continuous reads/writes to 4
tablespaces.  The Oracle tests run multiple select and insert/update
statements.
2. Run iostat during the tests.
3. top will show CPU0 at 100% system util while CPU1, CPU2, and CPU3 are
at .03%.
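The monitoring in step 3 can be scripted rather than watched interactively. The loop below is a hedged sketch that samples the cumulative CPU time of ksoftirqd_CPU0 (process name taken from this report) every few seconds; a TIME column that keeps climbing while iostat runs points to the softirq storm described above. On a machine without that process, ps simply prints nothing.

```shell
#!/bin/sh
# Sample ksoftirqd_CPU0's accumulated CPU time a few times; steadily
# increasing TIME while the workload and iostat run indicates that
# softirq processing is pinned at 100% on CPU0.
for sample in 1 2 3; do
    date '+%H:%M:%S'
    ps -o pid=,time=,comm= -C ksoftirqd_CPU0
    sleep 3
done
```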

Additional info:

kernel -- 2.4.9-e.8 enterprise.AS2.1 i686
sysstat 4.0.1 Release 2
Comment 1 jeffrey.buchsbaum 2003-02-13 12:32:31 EST
This is really the same bug as 83789.  I have the same problem with just
one telnet session running.

Dell 530, dual 2.4GHz
4GB ECC RAM
Comment 2 Alan Cox 2003-06-05 10:51:14 EDT
I'm unconvinced they are the same thing.
Comment 3 edward dertouzas 2003-07-27 01:32:21 EDT
    6 root      34  19     0    0     0 SWN   0.0  0.0 319:40 ksoftirqd_CPU0
   10 root      15   0     0    0     0 SW    0.0  0.0  95:25 kswapd
   13 root      15   0     0    0     0 SW    0.0  0.0 168:12 bdflush

Linux 2.4.9-e.12enterprise #1 SMP Tue Feb 11 01:29:18 EST 2003 i686 unn

This happens in a production environment running Oracle. Every few days
the system becomes completely unresponsive except for rudimentary
functioning of the TCP/IP stack: connect() returns success, but the
remote host just idles from that point forward. The server is
unresponsive on the console until it either recovers (anywhere between
5 and 45 minutes, usually after the Oracle listener and db have died) or
the host is manually power-cycled.

Note: this is not connected with any orinoco problems.
Comment 4 Nils Philippsen 2004-01-13 10:38:13 EST
Does the problem still show with recent kernels?

Anyway, this sounds like a kernel/scheduling problem to me, even more so
because the problem shows up not only when iostat runs.
Comment 5 Charlie Bennett 2004-09-23 15:29:43 EDT
handing off to the kernel group
Comment 6 Jason Baron 2004-09-23 15:46:23 EDT
this is an old one, i think we should start w/reproducing it on the
latest rhel2.1 kernel, e.49. thanks.
Comment 7 Don Howard 2006-03-16 16:57:13 EST
This is a truly ancient report.  If there is no update here in the next
two weeks demonstrating this bug on a current 2.1 kernel, this ticket
will be closed.
