Bug 77652 - ksoftirqd kernel load issue - cpu load saturated by ksoftirqd
Status: CLOSED DUPLICATE of bug 73733
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 8.0
Hardware: i686 Linux
Priority: high  Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2002-11-11 10:06 EST by jeffrey.buchsbaum
Modified: 2006-02-21 13:50 EST
Doc Type: Bug Fix
Last Closed: 2006-02-21 13:50:07 EST

Description jeffrey.buchsbaum 2002-11-11 10:06:27 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.6 (X11; Linux i686; U;) Gecko/20020830

Description of problem:
ksoftirqd takes over the whole machine, filling up each CPU with load. It
happens after the machine has been idle overnight, and can only be fixed with a
reboot. It can skip nights. The machine is a dual Xeon 2.4 GHz with 4 GB RAM,
dual 120 GB disks, and an NVIDIA Quadro4 900 XGL video card.


Here is a paste from "top":

[jbuchsba@coil ~]$ top

  9:00am  up 4 days, 14:44,  1 user,  load average: 17.51, 19.60, 19.82
207 processes: 192 sleeping, 2 running, 13 zombie, 0 stopped
CPU0 states:  0.3% user, 56.2% system,  0.0% nice, 43.0% idle
CPU1 states:  0.5% user, 54.1% system,  0.0% nice, 44.5% idle
CPU2 states:  0.2% user, 68.0% system,  0.0% nice, 31.3% idle
CPU3 states:  0.3% user, 71.2% system,  0.0% nice, 27.5% idle
Mem:  3356868K av, 2139256K used, 1217612K free,       0K shrd,  295188K buff
Swap: 2097136K av,       0K used, 2097136K free                 1423700K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
    9 root      39  19     0    0     0 SWN  34.8  0.0 622:11 ksoftirqd_CPU2
   10 root      39  19     0    0     0 SWN  34.6  0.0 625:48 ksoftirqd_CPU3
10158 jbuchsba  15   0 17104  15M 12496 S     1.3  0.4  75:25 gnomemeeting
   24 root      15   0     0    0     0 SW    1.1  0.0  39:13 kjournald
 2376 root       6 -10  305M  48M 10592 S <   1.1  1.4  64:17 X
28321 jbuchsba  16   0  9728 9724  7120 S     1.1  0.2   2:10 gnome-terminal
  848 root      15   0   540  540   460 D     1.0  0.0  25:54 syslogd
31086 jbuchsba  15   0  1152 1152   784 R     1.0  0.0   0:45 top
 6927 jbuchsba  15   0  5952 5948  5016 D     0.5  0.1  33:20 magicdev
 6993 jbuchsba  15   0  7908 7904  6464 S     0.3  0.2   8:35 multiload-apple
 1515 root      15   0  8920 8920  8780 S     0.1  0.2   0:09 httpd
 6925 jbuchsba  15   0 12172  11M  8820 S     0.1  0.3   4:16 gnome-panel
    1 root      15   0   476  476   424 S     0.0  0.0   0:07 init
    2 root      0K   0     0    0     0 SW    0.0  0.0   0:00 migration_CPU0
    3 root      0K   0     0    0     0 SW    0.0  0.0   0:00 migration_CPU1
    4 root      0K   0     0    0     0 SW    0.0  0.0   0:00 migration_CPU2
    5 root      0K   0     0    0     0 SW    0.0  0.0   0:00 migration_CPU3


Version-Release number of selected component (if applicable):
2.4.18-17.8.0smp #1 SMP Tue Oct 8 12:39:01 EDT 2002 i686 i686 i386 GNU/Linux


How reproducible:
Always

Steps to Reproduce:
1. Boot machine.
2. Leave overnight.
3. See screensavers pause on return the next day...like a hiccup.
	

Actual Results:  Slow machine with jerky I/O.

Expected Results:  Super fast machine.

Additional info:

This is a bug seen on kernel bug mailing lists. RH, you need to fix this and put
a new kernel out!  I also have a huge memory leak...not sure if it is Gnome
terminal or some other device.
Comment 1 Arjan van de Ven 2002-11-11 10:09:58 EST
do you have the nvidia binary only kernel modules loaded?
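
The question above can be answered by looking at /proc/modules. What follows is a minimal sketch in Python, not something taken from this report: the helper names `loaded_modules` and `is_module_loaded` are hypothetical, and it assumes only that each line of /proc/modules starts with the module name.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Assumption: the first whitespace-separated field of every line is the
    module name; the remaining fields (size, use count, ...) are ignored.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_module_loaded(name, path="/proc/modules"):
    """Check the running system for a module by name (Linux only)."""
    with open(path) as f:
        return name in loaded_modules(f.read())
```

On the affected machine, `is_module_loaded("nvidia")` would be the call to try, since "nvidia" is the name that later shows up in /proc/interrupts and in the rmmod command.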
Comment 2 jeffrey.buchsbaum 2002-11-11 10:22:44 EST
Yes, I have the nvidia drivers, but I built them from their rpm src file.

jeff
Comment 3 Arjan van de Ven 2002-11-11 10:27:44 EST
ok here's the problem: ksoftirqd uses cpu when you get a lot of interrupts.
It seems (3D) screensavers trigger this for you, which is probably a bug in the
binary only nvidia driver.
If you can see in /proc/interrupts that another device is causing interrupts AND
you can reproduce this without the nvidia driver ever loaded, please reopen this
bug.

*** This bug has been marked as a duplicate of 73733 ***
Comment 4 jeffrey.buchsbaum 2002-11-11 10:36:40 EST
Here is a paste of my /proc/interrupts AFTER a reboot (this keyboard/mouse I/O
was driving me nuts!).


Thanks.  Perhaps NVIDIA will release their source to you someday????

jb


PASTE:

$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:     151043     149473     153043     145432    IO-APIC-edge  timer
  1:         35         35         36         37    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  3:          3          1          2          0    IO-APIC-edge  serial
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:        161        107        154        166    IO-APIC-edge  PS/2 Mouse
 14:       7233       7322       7287       6891    IO-APIC-edge  ide0
 15:       9847       9487       9876       9510    IO-APIC-edge  ide1
 16:      29220      28370      29065      26273   IO-APIC-level  nvidia
 18:      12713      12637      12737      12474   IO-APIC-level  SB Live
 19:        114         84        118        106   IO-APIC-level  aic7xxx, usb-uhci
 20:          4          4          4          4   IO-APIC-level  aic7xxx
 23:      21556      20963      21382      21074   IO-APIC-level  usb-uhci, eth0
NMI:          0          0          0          0
LOC:     598883     598486     598873     598881
ERR:          0
MIS:          0
Comment 5 Mike A. Harris 2002-11-14 20:29:42 EST
Very unlikely that they would do that.
Comment 6 jeffrey.buchsbaum 2003-03-06 10:11:38 EST
A new version of the nvidia driver has the same problem, and other people report
this bug with ethernet drivers, etc.

So, I think the conclusion that it is the nvidia driver is not correct....

This is now happening on a daily basis...and is a big, big problem for me and
my work.  It got MUCH worse with a recent updfstab run......more usb devices were
brought into fstab (flash card reader and zip 250 to be precise).



Pastes of /proc/interrupts:

A fresh rebooted system (9AM):

          CPU0       CPU1       CPU2       CPU3
  0:     218970     217810     215901     217999    IO-APIC-edge  timer
  1:        215        214        215        213    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:       9975      10200       9906      10335    IO-APIC-edge  PS/2 Mouse
 14:       8070       8396       8386       8572    IO-APIC-edge  ide0
 15:      20151      19695      19402      19769    IO-APIC-edge  ide1
 16:      34475      33693      33058      33621   IO-APIC-level  nvidia
 18:      18857      18772      18662      18806   IO-APIC-level  SB Live
 19:        371        328        342        359   IO-APIC-level  aic7xxx, usb-uhci
 20:          4          4          4          4   IO-APIC-level  aic7xxx
 23:      32973      33308      32834      33029   IO-APIC-level  usb-uhci, eth0
NMI:          0          0          0          0
LOC:     870392     870405     870399     870405
ERR:          0
MIS:          0



jb
Comment 7 Arjan van de Ven 2003-03-06 10:21:42 EST
The thing other people report is an orinoco_cs bug, we know about that.
Please try this without the nvidia module loaded AT ALL.
It appears your machine uses level interrupts for the irq nvidia uses... that
normally requires a validly written driver.

Comment 8 jeffrey.buchsbaum 2003-03-06 11:03:28 EST
Now, one hour later....load is 18.22+, and ksoftirqd_CPU(0-3) are at the top of "top".

New cat /proc/interrupts

~]$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:     593823     603097     600104     593370    IO-APIC-edge  timer
  1:        395        385        390        385    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:      17124      17104      16886      17210    IO-APIC-edge  PS/2 Mouse
 14:     105586     107111     106192     106488    IO-APIC-edge  ide0
 15:      43088      42864      42539      42595    IO-APIC-edge  ide1
 16:      97088      98183      98047      96355   IO-APIC-level  nvidia
 18:      53127      53994      53852      53255   IO-APIC-level  SB Live
 19:        371        328        342        359   IO-APIC-level  aic7xxx, usb-uhci
 20:          4          4          4          4   IO-APIC-level  aic7xxx
 23:      89931      92120      92063      90208   IO-APIC-level  usb-uhci, eth0
NMI:          0          0          0          0
LOC:    2390236    2390249    2390243    2390248
ERR:          0
MIS:          0
[jbuchsba@coil ~]$

Please send me explicit email as to what I should do and how I should proceed.....

Thanks.

jb
Comment 9 jeffrey.buchsbaum 2003-03-06 13:33:41 EST
OK,

I logged in remotely,
did telinit 3 as root
did an rmmod nvidia
and within 1 minute load was 0.01 from 17.00+,

so, I stand corrected and the bug is with nvidia....


Ugh.

Jeff

(*it looks like I'll be buying a new video card...)
Comment 10 jeffrey.buchsbaum 2003-03-08 08:52:21 EST
Oddly, if I use windowmaker instead of the kde/gnome that comes with RedHat 8,
all is well...for two days now (with kde/gnome there was a 100% chance of the
cpu load being at 20 by that point in time).

This problem did NOT exist in 7.3..only 8.0.  I have not tried phoebe. 

openGL/nvidia is still loaded and now is fine....SO, the problem seems to be in
the lap of the RH8 gui.

Please comment.


BTW, when I booted with noapic, the load moved from ksoftirqd_CPUx to
keventd/kjournald.....perhaps these, or KDE as shipped by RH, are broken.

TIA!


Jeff
Comment 11 jeffrey.buchsbaum 2003-03-10 09:01:54 EST
Update on this after 2 days of uptime.

NO HANG USING WINDOWMAKER...no other changes at all....making me think that the
bug is NOT with the nvidia drivers but with the redhat implementation of the gui.

I checked with friends, and mandrake and suse (current versions) with kde and
gnome do not hang on the same hardware.....so the issue is likely with the
"Bluecurve" stuff....

Please advise on a time frame to examine this and to check if the other
ksoftirqd issues on bugzilla are due to the same issue.

Thanks.

Jeff
Comment 12 Arjan van de Ven 2003-03-10 09:09:54 EST
> Please advise on a time frame to examine this and to check if the other
> ksoftirqd issues on bugzilla are due to the same issue.

any reports with nvidia kernel modules are ignored. it's not worth my time to
investigate interaction issues with this module we don't have code for.


*** This bug has been marked as a duplicate of 73733 ***
Comment 13 jeffrey.buchsbaum 2003-03-10 11:28:46 EST
If this bug is JUST due to Nvidia, why does it ONLY occur with the redhat
gnome/kde.....windowmaker is just fine?




JB
Comment 14 Arjan van de Ven 2003-03-10 11:34:12 EST
window maker might just use a subset of the driver's features.
Really, please stop reopening this. machines with the nvidia module loaded are
not supported and we CAN'T fix the module.

*** This bug has been marked as a duplicate of 73733 ***
Comment 15 jeffrey.buchsbaum 2003-03-12 17:59:31 EST
Ok...so you don't want me to reopen it if I use nvidia....

so I removed all the nvidia stuff and installed the XiG DX Platinum drivers (the
best...way better than XFree86..sorry).

Same problem.  

Freezes and problems in Gnome.  NONE in windowmaker.

The problem is clearly with the code by RedHat....please re-open and look at the
problem.

Jeff
Comment 16 Arjan van de Ven 2003-03-12 18:04:21 EST
ok so I need two cat /proc/interrupts outputs taken about 2 seconds from each
other WHEN THE PROBLEM IS HAPPENING.
In addition it'll be useful to enable kernel profiling ("nmi_watchdog=1
profile=1" on the kernel commandline) and then
readprofile -r
sleep 10
readprofile -m /boot/System.map | sort -n

to show a list of functions where the kernel spends its time
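
The two-snapshot comparison requested above can also be scripted. The following is a minimal sketch in Python, not part of the original request: the names `parse_interrupts`, `interrupt_deltas` and `sample_interrupts` are hypothetical, and it assumes the /proc/interrupts layout pasted in this bug (a header row of CPU names, then one row per IRQ with a count per CPU followed by the controller and device names).

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: (per_cpu_counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # e.g. a wrapped fragment of the previous row
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


def interrupt_deltas(before, after):
    """Return (label, description, delta summed over CPUs), largest first."""
    rows = []
    for label, (counts_after, description) in after.items():
        if label not in before:
            continue
        delta = sum(counts_after) - sum(before[label][0])
        rows.append((label, description, delta))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


def sample_interrupts(interval=2.0, path="/proc/interrupts"):
    """Take two snapshots `interval` seconds apart on a live Linux system."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    return interrupt_deltas(first, second)
```

Calling `sample_interrupts()` while the load is high lists the IRQ lines whose counters moved the most during the interval, which is the information the two manual pastes are meant to provide.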
Comment 17 jeffrey.buchsbaum 2003-03-17 13:00:28 EST
So, right away there was a crash in the "boxed" screensaver (just running the
module manually in demo mode does not produce the crash...I tried for an
hour).

With the flags, the machine is now frozen solid (on the x11 console).  I could
telnet in and find a poorly responsive, but alive, machine:

w got:

 11:56am  up 4 days,  2:48,  2 users,  load average: 13.48, 13.75, 13.87
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
jbuchsba :0       -                 8:03am   ?     0.00s   ?     -
jeffb    pts/3    slab             11:06am  0.00s  0.07s  0.02s  w 


Doing cat /proc/interrupts with about 2 seconds between hitting return got:
[jeffb@coil jeffb]$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:   45139083   45476391   43976219   45996759    IO-APIC-edge  timer
  1:        760        760        763        754    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:      36206      36630      36252      36099    IO-APIC-edge  PS/2 Mouse
 14:     445461     278205      97800     513978    IO-APIC-edge  ide0
 15:      72197     246626      45307     249435    IO-APIC-edge  ide1
 18:    4130887    4161548    4024310    4209429   IO-APIC-level  SB Live
 19:     248791     315531      21520     417173   IO-APIC-level  aic7xxx, usb-uhci
 20:          4          4          4          4   IO-APIC-level  aic7xxx
 23:    6183082    6264429    5993566    6354479   IO-APIC-level  usb-uhci, eth0
NMI:  180588150  180588150  180588150  180588150 
LOC:  180603531  180603523  180603536  180603536 
ERR:          0
MIS:          1
[jeffb@coil jeffb]$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:   45140372   45476391   43976219   45997009    IO-APIC-edge  timer
  1:        760        760        763        754    IO-APIC-edge  keyboard
  2:          0          0          0          0          XT-PIC  cascade
  8:          1          0          0          0    IO-APIC-edge  rtc
 12:      36206      36630      36252      36099    IO-APIC-edge  PS/2 Mouse
 14:     445775     278205      97800     514276    IO-APIC-edge  ide0
 15:      72197     246626      45307     249435    IO-APIC-edge  ide1
 18:    4131007    4161548    4024310    4209450   IO-APIC-level  SB Live
 19:     249044     315531      21520     417220   IO-APIC-level  aic7xxx, usb-uhci
 20:          4          4          4          4   IO-APIC-level  aic7xxx
 23:    6183286    6264429    5993566    6354513   IO-APIC-level  usb-uhci, eth0
NMI:  180589689  180589689  180589689  180589689 
LOC:  180605070  180605063  180605076  180605075 
ERR:          0
MIS:          1



I am not sure what I need to do to get the above readprofile to work....I could
not find readprofile via which as root or as a user...is that a boot command?
(no, I am not a grub guru...  :-)  )  Please let me know what else I can do to
help get information up...the machine is crashed right now (always < 2 hours
after loading gnome).

jeff
Comment 18 jeffrey.buchsbaum 2003-03-17 13:03:21 EST
Clarification:

System has no NVIDIA software on board now. Just XiG platinum dx drivers....so
XFree86 is completely replaced...same bug......making it more likely to be a
gnome issue.

The graphics card was swapped out as well....the Quadro4 900 XGL was changed to
an ATI Fire GL 8800.

Of note, the bug again is absent in windowmaker.

JB
Comment 19 jeffrey.buchsbaum 2003-03-17 15:29:48 EST
Per the request.

Jeff




[root@coil /]# cd /usr/sbin
[root@coil sbin]# ./readprofile -r
[root@coil sbin]# sleep 10
[root@coil sbin]# readprofile -m /boot/System.map | sort -n
bash: readprofile: command not found
[root@coil sbin]# ./readprofile -m /boot/System.map | sort -n
     1 add_blkdev_randomness                      0.0104
     1 add_timer                                  0.0104
     1 __alloc_pages                              0.0014
     1 atomic_dec_and_lock                        0.0145
     1 __block_commit_write                       0.0048
     1 buffer_insert_inode_queue                  0.0104
     1 call_reschedule_interrupt                  0.0909
     1 clear_page_tables                          0.0104
     1 collect_signal                             0.0039
     1 __constant_memcpy                          0.0037
     1 copy_strings                               0.0016
     1 del_timer                                  0.0104
     1 disk_round_stats                           0.0156
     1 do_gettimeofday                            0.0078
     1 do_mmap_pgoff                              0.0006
     1 do_select                                  0.0017
     1 emit_log_char                              0.0089
     1 end_buffer_io_sync                         0.0208
     1 eth_type_trans                             0.0052
     1 exit_mmap                                  0.0027
     1 fd_install                                 0.0125
     1 fib_lookup                                 0.0031
     1 filemap_fdatawait                          0.0042
     1 file_read_actor                            0.0039
     1 __find_lock_page_helper                    0.0057
     1 find_snap_client                           0.0125
     1 fn_hash_lookup                             0.0037
     1 __generic_copy_from_user                   0.0089
     1 generic_plug_device                        0.0089
     1 generic_unplug_device                      0.0125
     1 get_vm_area                                0.0045
     1 __global_save_flags                        0.0104
     1 handle_IRQ_event                           0.0063
     1 handle_mm_fault                            0.0030
     1 handle_stop_signal                         0.0063
     1 idle_cpu                                   0.0312
     1 internal_add_timer                         0.0057
     1 interruptible_sleep_on                     0.0078
     1 ip_route_input_slow                        0.0004
     1 IRQ0x17_interrupt                          0.0833
     1 kmap_high                                  0.0104
     1 kstat_read_proc                            0.0009
     1 locate_hd_struct                           0.0069
     1 lock_vma_mappings                          0.0208
     1 may_open                                   0.0031
     1 new_inode                                  0.0089
     1 page_cache_read                            0.0039
     1 proc_info_read                             0.0033
     1 proc_pid_lookup                            0.0019
     1 proc_pid_statm                             0.0023
     1 __read_lock_failed                         0.0500
     1 release_console_sem                        0.0057
     1 remove_wait_queue                          0.0312
     1 run_timer_list                             0.0025
     1 send_sig_info                              0.0045
     1 setup_frame                                0.0019
     1 setup_sigcontext                           0.0033
     1 smp_send_reschedule                        0.0156
     1 sockfd_lookup                              0.0078
     1 submit_bh                                  0.0078
     1 supplemental_group_member                  0.0156
     1 sys_select                                 0.0008
     1 sys_sigreturn                              0.0035
     1 task_dumpable                              0.0208
     1 tcp_transmit_skb                           0.0009
     1 .text.lock.acct                            0.0085
     1 .text.lock.ioctl                           0.0256
     1 .text.lock.printk                          0.0044
     1 .text.lock.readdir                         0.0097
     1 tty_read                                   0.0031
     1 unix_dgram_recvmsg                         0.0028
     1 unix_write_space                           0.0069
     1 update_wall_time_one_tick                  0.0057
     1 vfs_permission                             0.0031
     1 write_profile                              0.0063
     2 account_io_end                             0.0250
     2 batch_entropy_store                        0.0114
     2 blkdev_release_request                     0.0179
     2 __block_prepare_write                      0.0024
     2 __constant_c_and_count_memset              0.0125
     2 __constant_memcpy                          0.0074
     2 do_anonymous_page                          0.0054
     2 do_check_pgt_cache                         0.0096
     2 do_page_fault                              0.0016
     2 do_zap_page_range                          0.0052
     2 d_rehash                                   0.0179
     2 dup_mmap                                   0.0038
     2 end_level_ioapic_irq                       0.0057
     2 fget                                       0.0312
     2 __find_get_page                            0.0250
     2 fput                                       0.0063
     2 __free_pages_ok                            0.0024
     2 get_gendisk                                0.0312
     2 get_unused_fd                              0.0048
     2 ide_do_request                             0.0042
     2 ide_error                                  0.0043
     2 ide_set_handler                            0.0125
     2 inode_has_buffers                          0.0312
     2 iput                                       0.0028
     2 IRQ0x13_interrupt                          0.1667
     2 kfree                                      0.0104
     2 kmem_cache_free                            0.0139
     2 kunmap_high                                0.0156
     2 load_balance                               0.0021
     2 __make_request                             0.0012
     2 page_add_rmap                              0.0125
     2 rmqueue                                    0.0027
     2 __switch_to                                0.0078
     2 sys_fsync                                  0.0096
     2 .text.lock.locks                           0.0104
     2 unlock_page                                0.0179
     2 update_one_process                         0.0069
     2 __wait_on_buffer                           0.0125
     2 zap_pte_range                              0.0039
     3 __brelse                                   0.0938
     3 do_IRQ                                     0.0099
     3 do_signal                                  0.0044
     3 do_syslog                                  0.0032
     3 __free_pages                               0.0938
     3 generic_file_write                         0.0014
     3 mark_page_accessed                         0.0208
     3 page_remove_rmap                           0.0134
     3 pool_find_page                             0.0375
     3 real_lookup                                0.0094
     3 refile_buffer                              0.0625
     3 strnlen_user                               0.0441
     3 switch_mm                                  0.0093
     3 __tasklet_hi_schedule                      0.0312
     3 vsnprintf                                  0.0027
     4 copy_page_range                            0.0081
     4 d_alloc                                    0.0100
     4 dput                                       0.0096
     4 ide_intr                                   0.0100
     4 IRQ0x0e_interrupt                          0.3333
     4 pte_chain_free                             0.0357
     5 ide_end_request                            0.0240
     5 link_path_walk                             0.0026
     5 proc_lookup                                0.0223
     5 proc_pid_stat                              0.0047
     5 try_to_wake_up                             0.0116
     6 bh_action                                  0.0469
     6 __kmem_cache_alloc                         0.0197
     6 pci_pool_alloc                             0.0156
     6 reschedule_interrupt                       0.2857
     6 .text.lock.inode                           0.0123
     7 d_lookup                                   0.0230
     7 .text.lock.namei                           0.0059
     8 page_fault                                 0.6667
     8 __wake_up                                  0.0625
     9 pci_pool_free                              0.0331
     9 set_ioapic_affinity                        0.0511
    10 pte_chain_alloc                            0.1042
    11 smp_apic_timer_interrupt                   0.0491
    11 start_request                              0.0181
    12 get_hash_table                             0.0833
    12 unlock_buffer                              0.1500
    15 .text.lock.sched                           0.0285
    17 number                                     0.0197
    17 scheduler_tick                             0.0236
    19 ide_dma_intr                               0.0913
    29 ide_wait_stat                              0.0954
    35 invalidate_bdev                            0.0875
    39 apic_timer_interrupt                       1.6250
    55 ide_dmaproc                                0.0637
    95 do_rw_disk                                 0.0565
   133 statm_pte_range                            0.4030
   213 ret_from_sys_call                         12.5294
   350 ksoftirqd                                  1.2153
   611 restore_all                               40.7333
  3090 deliver_to_old_ones                       14.8558
  3955 schedule                                   5.2593
  4212 sys_sched_yield                           13.1625
  7421 do_softirq                                33.1295
  9534 system_call                              170.2500
  9981 tasklet_hi_action                         62.3813
 18811 default_idle                             235.1375
 20767 __rdtsc_delay                            648.9688
 79780 total                                      0.0549
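
A listing this long is easier to read as shares of the total. The following is a minimal sketch in Python, not something used in this report: the names `parse_readprofile` and `top_functions` are hypothetical, and it assumes the three-column readprofile output shown above (tick count, function name, normalized load), with the "total" row and any shell prompt lines skipped.

```python
def parse_readprofile(text):
    """Return {function_name: ticks} from readprofile output.

    Rows that are not "<ticks> <name> <load>" are ignored, as is the
    final "total" row. Repeated names are summed.
    """
    ticks = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        count, name = int(fields[0]), fields[1]
        if name == "total":
            continue
        ticks[name] = ticks.get(name, 0) + count
    return ticks


def top_functions(ticks, n=10):
    """Return the n busiest functions as (name, ticks, percent of all ticks)."""
    total = sum(ticks.values())
    ranked = sorted(ticks.items(), key=lambda item: item[1], reverse=True)
    return [(name, count, 100.0 * count / total) for name, count in ranked[:n]]
```

In the profile above, for example, __rdtsc_delay accounts for 20767 of the 79780 total ticks, roughly 26%, and default_idle for 18811.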
Comment 20 jeffrey.buchsbaum 2003-03-17 18:12:30 EST
Just for completeness, without the crash/high load level...being logged into
windowmaker...I get:

[root@coil sbin]# ./readprofile -r
[root@coil sbin]# sleep 10
[root@coil sbin]# ./readprofile -m /boot/System.map | sort -n
     1 alloc_skb                                  0.0021
     1 __constant_c_and_count_memset              0.0063
     1 d_alloc                                    0.0025
     1 d_instantiate                              0.0104
     1 do_wp_page                                 0.0012
     1 follow_page                                0.0078
     1 fput                                       0.0031
     1 get_empty_filp                             0.0031
     1 IRQ0x12_interrupt                          0.0833
     1 IRQ0x17_interrupt                          0.0833
     1 kstat_read_proc                            0.0009
     1 link_path_walk                             0.0005
     1 __mark_inode_dirty                         0.0052
     1 poll_freewait                              0.0125
     1 proc_lookup                                0.0045
     1 pty_chars_in_buffer                        0.0125
     1 pty_unthrottle                             0.0089
     1 reschedule_interrupt                       0.0476
     1 rmqueue                                    0.0014
     1 set_ioapic_affinity                        0.0057
     1 sock_def_readable                          0.0069
     1 statm_pgd_range                            0.0052
     1 sys_close                                  0.0078
     1 sys_select                                 0.0008
     1 system_call                                0.0179
     1 udp_v4_mcast_deliver                       0.0023
     1 unix_ioctl                                 0.0048
     1 vsnprintf                                  0.0009
     1 __wake_up                                  0.0078
     2 d_lookup                                   0.0066
     2 load_balance                               0.0021
     2 netif_receive_skb                          0.0037
     2 proc_pid_make_inode                        0.0104
     2 write_profile                              0.0125
     3 atomic_dec_and_lock                        0.0435
     3 collect_sigign_sigcatch                    0.0234
     3 __kmem_cache_alloc                         0.0099
     3 proc_pid_statm                             0.0069
     4 number                                     0.0046
     4 smp_apic_timer_interrupt                   0.0179
     6 fget                                       0.0938
     7 proc_pid_stat                              0.0066
     9 scheduler_tick                             0.0125
    18 apic_timer_interrupt                       0.7500
    38 statm_pte_range                            0.1152
 60970 default_idle                             762.1250
 61107 total                                      0.0421
[root@coil sbin]# 

[jbuchsba@coil sbin]$ w
  5:10pm  up  2:37,  2 users,  load average: 0.06, 0.08, 0.03
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
jbuchsba :0       -                 5:07pm   ?     0.00s   ?     -
jbuchsba pts/1    -                 5:07pm  3:39   1.37s  1.37s  top 
[jbuchsba@coil sbin]$ 
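
Because the loaded and idle profiles have different totals, comparing them by share of ticks is fairer than comparing raw counts. The following is a minimal sketch in Python, not part of the original report: the names `tick_shares` and `share_increase` are hypothetical, and the input is assumed to be readprofile output in the three-column form pasted in this bug.

```python
def tick_shares(text):
    """Return {function_name: fraction of all sampled ticks}."""
    ticks = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only "<ticks> <name> <load>" rows; skip the "total" row.
        if len(fields) == 3 and fields[0].isdigit() and fields[1] != "total":
            ticks[fields[1]] = ticks.get(fields[1], 0) + int(fields[0])
    total = sum(ticks.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in ticks.items()}


def share_increase(loaded_text, baseline_text, n=5):
    """List functions whose share of ticks grew most versus the baseline.

    Returns up to n (name, change) pairs, where change is the loaded share
    minus the baseline share; functions absent from the baseline count as 0.
    """
    loaded = tick_shares(loaded_text)
    baseline = tick_shares(baseline_text)
    growth = [
        (name, share - baseline.get(name, 0.0))
        for name, share in loaded.items()
    ]
    growth.sort(key=lambda item: item[1], reverse=True)
    return growth[:n]
```

Feeding it the high-load listing as `loaded_text` and the windowmaker listing as `baseline_text` would put the functions that only show up under load at the head of the list.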




Comment 21 jeffrey.buchsbaum 2003-03-21 11:18:18 EST
Update....

Updated to the current kernel for 8.0.
Glibc 2.3.2x was updated/installed.


Same thing...

[root@coil sbin]# ./bugzillascript
     1 account_io_start                           0.0104
     1 batch_entropy_store                        0.0057
     1 __brelse                                   0.0312
     1 call_do_IRQ                                0.0769
     1 __constant_c_and_count_memset              0.0063
     1 __constant_memcpy                          0.0037
     1 copy_page_range                            0.0020
     1 d_lookup                                   0.0033
     1 do_fcntl                                   0.0013
     1 do_no_page                                 0.0015
     1 do_select                                  0.0017
     1 fget                                       0.0156
     1 filp_close                                 0.0048
     1 __generic_copy_to_user                     0.0125
     1 generic_file_write                         0.0005
     1 get_gendisk                                0.0156
     1 handle_mm_fault                            0.0030
     1 ide_destroy_dmatable                       0.0208
     1 inode_has_buffers                          0.0156
     1 ip_check_mc                                0.0156
     1 kfree                                      0.0052
     1 __kmem_cache_alloc                         0.0033
     1 mark_page_accessed                         0.0069
     1 new_inode                                  0.0089
     1 number                                     0.0012
     1 page_fault                                 0.0833
     1 proc_file_lseek                            0.0048
     1 proc_lookup                                0.0045
     1 proc_pid_cmdline                           0.0037
     1 prune_dcache                               0.0019
     1 reschedule_interrupt                       0.0476
     1 set_page_dirty                             0.0078
     1 skb_release_data                           0.0069
     1 sleep_on                                   0.0078
     1 switch_mm                                  0.0031
     1 sys_rt_sigprocmask                         0.0022
     1 __tasklet_hi_schedule                      0.0104
     1 .text.lock.namei                           0.0008
     1 .text.lock.socket                          0.0044
     2 do_IRQ                                     0.0066
     2 get_hash_table                             0.0139
     2 kunmap_high                                0.0156
     2 load_balance                               0.0021
     2 pte_chain_free                             0.0179
     2 schedule                                   0.0027
     2 set_ioapic_affinity                        0.0114
     2 statm_pgd_range                            0.0104
     2 system_call                                0.0357
     2 .text.lock.inode                           0.0041
     2 write_profile                              0.0125
     3 restore_all                                0.2000
     3 smp_apic_timer_interrupt                   0.0134
     3 __wake_up                                  0.0234
     4 ide_wait_stat                              0.0132
     4 kmap_high                                  0.0417
     4 pte_chain_alloc                            0.0417
     4 .text.lock.ioctl                           0.1026
     4 unlock_buffer                              0.0500
     5 bh_action                                  0.0391
     5 start_request                              0.0082
     6 apic_timer_interrupt                       0.2500
     7 ide_dmaproc                                0.0081
     7 scheduler_tick                             0.0097
    19 do_rw_disk                                 0.0113
    19 statm_pte_range                            0.0576
    23 ksoftirqd                                  0.0799
  1098 do_softirq                                 4.9018
  3050 tasklet_hi_action                         19.0625
  3602 .text.lock.dev                             8.8938
  3889 deliver_to_old_ones                       18.6971
  4180 __rdtsc_delay                            130.6250
  7861 default_idle                              98.2625
 23861 total   



Please respond/post ideas about this...as I am losing a lot of work because of
this and might have to flee RedHat altogether.  My mandrake box at home has none
of this ...same setup, different obviously in software...

I really want to support RHL and just paid for additional support
yesterday...but I really think this is a big deal.....


jb
Comment 22 jeffrey.buchsbaum 2003-03-21 11:23:49 EST
Addendum: 
Logging out of gnome and logging in (to a "crashed" state) makes CPU load go to
near 0....here is the proc file:

[root@coil sbin]# ./bugzillascript
     1 copy_page_range                            0.0020
     1 d_alloc                                    0.0025
     1 __generic_copy_to_user                     0.0125
     1 get_user_pages                             0.0020
     1 ip_route_input                             0.0020
     1 iput                                       0.0014
     1 IRQ0x10_interrupt                          0.0833
     1 kfree                                      0.0052
     1 __kmem_cache_alloc                         0.0033
     1 kmem_cache_free                            0.0069
     1 link_path_walk                             0.0005
     1 netif_rx                                   0.0021
     1 new_inode                                  0.0089
     1 set_ioapic_affinity                        0.0057
     1 smp_apic_timer_interrupt                   0.0045
     1 sock_ioctl                                 0.0078
     1 sys_write                                  0.0031
     1 write_profile                              0.0063
     2 proc_pid_stat                              0.0019
     3 atomic_dec_and_lock                        0.0435
     4 scheduler_tick                             0.0056
     7 apic_timer_interrupt                       0.2917
    15 statm_pte_range                            0.0455
 20436 default_idle                             255.4500
 20485 total                                      0.0141

Hope this helps...it definitely is NOT video card/driver related...it is GNOME
related in the RedHat modification....Gnome on other linux brands does not do
this (same hardware).  Mandrake 9 is at home.....
Comment 23 jeffrey.buchsbaum 2003-03-21 11:32:43 EST
Addendum 2:
After logging into WindowMaker, if I log out and log into Gnome....it half opens
(RH menu fails to work, two terms to work, no desktop icons come up....most of
the menu bar is missing....)

Here is the procfile:

[root@coil sbin]# ./bugzillascript
     1 add_timer                                  0.0104
     1 atomic_dec_and_lock                        0.0145
     1 bh_action                                  0.0078
     1 blkdev_release_request                     0.0089
     1 __brelse                                   0.0312
     1 __constant_c_and_count_memset              0.0063
     1 do_munmap                                  0.0013
     1 do_no_page                                 0.0015
     1 do_page_fault                              0.0008
     1 do_readv_writev                            0.0014
     1 do_syslog                                  0.0011
     1 do_zap_page_range                          0.0026
     1 fget                                       0.0156
     1 __find_get_page                            0.0125
     1 __find_lock_page                           0.0208
     1 flush_signal_handlers                      0.0125
     1 free_one_pmd                               0.0048
     1 __free_pte                                 0.0089
     1 generic_plug_device                        0.0089
     1 get_empty_filp                             0.0031
     1 get_unmapped_area                          0.0033
     1 get_unused_buffer_head                     0.0057
     1 ide_do_request                             0.0021
     1 IRQ0x0e_interrupt                          0.0833
     1 link_path_walk                             0.0005
     1 lru_cache_add                              0.0057
     1 __make_request                             0.0006
     1 move                                       0.0069
     1 neigh_lookup                               0.0045
     1 page_add_rmap                              0.0063
     1 page_remove_rmap                           0.0045
     1 path_release                               0.0156
     1 pte_chain_free                             0.0089
     1 remove_wait_queue                          0.0312
     1 ret_from_sys_call                          0.0588
     1 skb_recv_datagram                          0.0042
     1 strncpy_from_user                          0.0089
     1 submit_bh                                  0.0078
     1 sys_read                                   0.0031
     1 sys_setsid                                 0.0078
     1 tcp_v4_init_sock                           0.0042
     1 .text.lock.locks                           0.0052
     1 try_to_wake_up                             0.0023
     1 unix_write_space                           0.0069
     1 vsnprintf                                  0.0009
     1 wake_up_forked_process                     0.0033
     1 write_profile                              0.0063
     1 zap_pte_range                              0.0019
     2 __constant_memcpy                          0.0074
     2 do_sigaction                               0.0057
     2 find_vma                                   0.0208
     2 fsync_buffers_list                         0.0036
     2 __generic_copy_to_user                     0.0250
     2 generic_file_write                         0.0009
     2 kunmap_high                                0.0156
     2 netif_receive_skb                          0.0037
     2 pte_chain_alloc                            0.0208
     2 schedule                                   0.0027
     2 update_one_process                         0.0069
     3 apic_timer_interrupt                       0.1250
     3 del_timer                                  0.0312
     3 handle_IRQ_event                           0.0187
     4 d_lookup                                   0.0132
     4 do_anonymous_page                          0.0109
     4 file_read_actor                            0.0156
     4 system_call                                0.0714
     4 unlock_buffer                              0.0500
     5 get_hash_table                             0.0347
     5 start_request                              0.0082
     6 ide_dmaproc                                0.0069
     7 __constant_c_and_count_memset              0.0437
     7 ide_wait_stat                              0.0230
     8 page_fault                                 0.6667
    16 do_rw_disk                                 0.0095
    22 ksoftirqd                                  0.0764
   914 do_softirq                                 4.0804
  2200 tasklet_hi_action                         13.7500
  2790 .text.lock.dev                             6.8889
  3044 deliver_to_old_ones                       14.6346
  3127 __rdtsc_delay                             97.7188
  7913 default_idle                              98.9125
 20163 total                                      0.0139


That is "it" for me today....hope this data helps you guys figure this one out...
Jeff
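
Editor's aside for anyone comparing the three listings above: the idle ticks (default_idle) dominate each total, so it can help to rank only the non-idle symbols. Below is a minimal sketch, assuming the lines follow the three-column "ticks, symbol, normalized load" layout seen in the pastes. The helper names parse_profile and busy_share are hypothetical, not part of any tool mentioned in this bug, and SAMPLE is just the tail of the last listing, so the shares are relative to those lines only.

```python
def parse_profile(text):
    """Parse lines shaped like '<ticks> <symbol> <normalized load>'.

    Returns a list of (symbol, ticks) tuples. The trailing 'total' row and
    any line that does not start with an integer are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        if parts[1] == "total":
            continue
        rows.append((parts[1], int(parts[0])))
    return rows


def busy_share(rows, idle_symbol="default_idle"):
    """Return (symbol, fraction of non-idle ticks), largest first."""
    busy = [(name, ticks) for name, ticks in rows if name != idle_symbol]
    busy_total = sum(ticks for _, ticks in busy)
    if busy_total == 0:
        return []
    shares = [(name, ticks / busy_total) for name, ticks in busy]
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Tail of the listing pasted in this comment (not the full profile).
SAMPLE = """\
     22 ksoftirqd                                  0.0764
    914 do_softirq                                 4.0804
   2200 tasklet_hi_action                         13.7500
   2790 .text.lock.dev                             6.8889
   3044 deliver_to_old_ones                       14.6346
   3127 __rdtsc_delay                             97.7188
   7913 default_idle                              98.9125
  20163 total                                      0.0139
"""

if __name__ == "__main__":
    for name, share in busy_share(parse_profile(SAMPLE)):
        print(f"{share:7.2%}  {name}")
```

Running the same two functions over the "logged out" listing would leave only a handful of timer and scheduler symbols once default_idle is dropped, which makes the contrast with the softirq-heavy listings easier to see.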

Comment 24 jeffrey.buchsbaum 2003-03-21 17:30:04 EST
Ok, I lied....

I did a net search on __rdtsc_delay and noticed it had to do with audio.  I also
noted that on gnome and not on windowmaker an applet called CDPlayer 2.01 was
on my menu....installed in 7.3 and all was well then.

Well, I removed this applet from Gnome 5 hours ago and no crash....


Please try this in the RH official office...on an SMP xeon machine if you have
one.....


Anyway, I will let it run over the weekend and see what happens....this might be
the culprit....


jb
Comment 25 jeffrey.buchsbaum 2003-03-25 08:29:11 EST
The crash is back....no luck....it is a kernel thing, random, and bad.  Odd
that NO information is coming out of RH.....is this fixed in 9.0?


jb
Comment 26 Arjan van de Ven 2003-03-25 08:31:20 EST
could you try to rename the esd binary so that it doesn't auto-start?
sometimes it seems esd is causing very bad behavior
Comment 27 jeffrey.buchsbaum 2003-03-25 08:41:06 EST
Ok, rebooting after changing esd's name.  Following the esd bug led me back to
bugzilla, and your kernels.....any chance they would fix this?

i.e.: http://people.redhat.com/arjanv/testkernels/i686/*smp*

Thanks.

Jeff
Comment 28 jeffrey.buchsbaum 2003-03-25 08:45:00 EST
PS: I have the latest kernel.....as of 3/24/03.

jb
Comment 29 jeffrey.buchsbaum 2003-04-02 10:41:20 EST
Well, for other reasons, I decided to put kde 3.1 on my box via apt-get (the rpm
out there....).

No problem getting KDE.  Funny, the whole of my box stopped crashing even in
gnome ....so, I have no idea what was wrong, esd was not at fault....., but it
seems to be gone now x 5 days.


Perhaps rh9 with kde 3.1 (right?) will have my issue fixed by default...

I would leave this bug as solved via kde upgrade with no known direct cause.


Jeff
Comment 30 Mike A. Harris 2003-04-19 04:47:04 EDT
>System has no NVIDIA software on board now. Just XiG platinum dx drivers....so
>x86free is all changed...same bug......making it more likely to be a gnome
>issue.

We don't support _ANY_ 3rd party drivers.  We support only the drivers which
we ship with XFree86.

As has been stated several times, this issue is not something we will support
in any way, as we do not support 3rd party kernel modules or XFree86 drivers.

Closing (for the 4th or so time) as a duplicate of bug #73733
Comment 31 Mike A. Harris 2003-04-19 04:47:21 EDT

*** This bug has been marked as a duplicate of 73733 ***
Comment 32 Red Hat Bugzilla 2006-02-21 13:50:07 EST
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.
