Bug 57649 - Process keeps moving between CPUs on a dual processor system
Status: CLOSED WONTFIX
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: athlon Linux
Priority: medium
Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Reported: 2001-12-18 05:01 EST by Jeremy Sanders
Modified: 2007-04-18 12:38 EDT (History)
Doc Type: Bug Fix
Last Closed: 2003-06-07 19:51:49 EDT
Description Jeremy Sanders 2001-12-18 05:01:08 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

Description of problem:
We have someone running a job on a dual Athlon MP system, and the single
running process keeps moving between the CPUs (see the top output below).
The process makes virtually no system calls and is purely computational.


Version-Release number of selected component (if applicable):


How reproducible:
Didn't try

Steps to Reproduce:
Haven't had a chance to reproduce it, as the job takes 1 week to run so we
can't stop it! Any ideas for debugging it?


Additional info:

Mem:  1027952K av,  293540K used,  734412K free,       0K shrd,   75128K buff
  9:52am  up 8 days, 43 min,  2 users,  load average: 1.00, 1.00, 1.00
63 processes: 61 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 59.0% user,  0.0% system, 59.0% nice, 40.0% idle
CPU1 states: 42.0% user,  0.0% system, 42.0% nice, 57.0% idle
Mem:  1027952K av,  294412K used,  733540K free,       0K shrd,   75136K buff
Swap: 2048276K av,       0K used, 2048276K free                   89692K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  497 drb       19  19 10868  10M   552 R N  99.9  1.0 186:07 iladiskp.pc
  745 root       9   0  1216 1212   972 R     0.9  0.1   0:00 top
    1 root       8   0   528  528   460 S     0.0  0.0   0:04 init
    2 root       9   0     0    0     0 SW    0.0  0.0   0:00 keventd
    3 root      19  19     0    0     0 SWN   0.0  0.0   0:00 ksoftirqd_CPU0
    4 root      18  19     0    0     0 SWN   0.0  0.0   0:00 ksoftirqd_CPU1


/proc/cpuinfo:
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 6
model name	: AMD Athlon(tm) MP Processor 1800+
stepping	: 2
cpu MHz		: 1533.404
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips	: 3060.53

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 6
model name	: AMD Athlon(tm) Processor
stepping	: 2
cpu MHz		: 1533.404
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips	: 3060.53

/proc/interrupts:
           CPU0       CPU1       
  0:   34910573   34513974    IO-APIC-edge  timer
  1:          2          1    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          0          1    IO-APIC-edge  rtc
 11:    1214300    1205701   IO-APIC-level  eth0
 14:      41246      46624    IO-APIC-edge  ide0
 15:          2          0    IO-APIC-edge  ide1
NMI:          0          0 
LOC:   69423658   69423650 
ERR:          0
MIS:          0
Comment 1 Arjan van de Ven 2001-12-18 06:21:36 EST
There are two things that play together here:
* The Heisenberg effect of running "top" (the measurement/display tool needs CPU
  time itself and can bump your task to the other CPU).
* The scheduler had a bug where it would too easily bump tasks to other CPUs;
  there's an updated kernel at http://people.redhat.com/arjanv/testkernels which
  should be an improvement. However, it's still not good enough for our taste, and
  Ingo Molnar keeps working on improving it.

I understand it's not easy to test the new kernel, but if you get an opportunity
it would be interesting to see whether it helps in your case.
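
One way to watch for migrations with less overhead than "top" is to poll the
process's own /proc entry. The sketch below is only an illustration and is not
part of the test kernels: it assumes a Python interpreter is available on the
machine and that the kernel reports the last-used CPU as field 39 of
/proc/<pid>/stat (as described in proc(5)). The helper names cpu_from_stat,
count_migrations and sample are made up for this example. The poller still
needs a little CPU time of its own, so the measurement effect is reduced rather
than removed, and any migration that happens between two samples goes unseen.

```python
import time


def cpu_from_stat(stat_line):
    """Return the CPU a task last ran on (field 39 of /proc/<pid>/stat)."""
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so only the text after the final ')' is split on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 39 sits at index 36.
    return int(rest[36])


def count_migrations(cpus):
    """Count how often two consecutive samples report different CPUs."""
    return sum(1 for prev, cur in zip(cpus, cpus[1:]) if prev != cur)


def sample(pid, samples=60, interval=1.0):
    """Poll /proc/<pid>/stat and return the list of CPU numbers observed."""
    seen = []
    for _ in range(samples):
        with open("/proc/%d/stat" % pid) as f:
            seen.append(cpu_from_stat(f.read()))
        time.sleep(interval)
    return seen
```

For the job in this report, something like count_migrations(sample(497)) would
give a lower bound on the number of CPU changes over roughly one minute.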
Comment 2 Alan Cox 2003-06-07 19:51:49 EDT
No reply in a year
