Bug 88866 - Chronic instability with shrike (hard locks)
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 9
Hardware: i586
OS: Linux
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
Depends On:
Reported: 2003-04-15 01:50 UTC by Need Real Name
Modified: 2007-04-18 16:53 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2003-07-28 16:00:51 UTC

Description Need Real Name 2003-04-15 01:50:07 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
Since installing Shrike a week ago I have experienced hard lockups 2-8 times a day.

Occasionally it is an oops, but more commonly it is a hard lock with no input possible
(the system is locked solid).
I have tried 2.4.20-9 as well as the standard 2.4.20-8 kernel.

Version-Release number of selected component (if applicable):
2.4.20-9, 2.4.20-8, 2.4.20-54

How reproducible:

Steps to Reproduce:
1. Leave the computer powered on.

Actual Results:  Hard lock after a varying period of time, sometimes
immediately, sometimes after a few hours.

Expected Results:  no lock

Additional info:

Comment 1 Michael Lee Yohe 2003-04-15 14:36:18 UTC
This bug report is not very descriptive.  It would be a safe assumption that RHL9 is
not locking up most machines out there (else it would have made CNN!)  A brief
description of the hardware in your computer would be helpful.

You should include _at the very least_ the following:

1) Summary of your hardware configuration.
2) The output of "dmesg" after booting up.
3) The output of "lsmod" after booting up.
4) The contents of /proc/pci.
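
The machine-generated items in the list above can be gathered in one pass. The following is only a minimal sketch in Python: the function names and the section banner format are made up for illustration, and it assumes `dmesg` and `lsmod` are on the PATH. On a system without /proc/pci, or where a command cannot be run, it records a short note instead of failing.

```python
import subprocess
from pathlib import Path


def run_command(argv):
    """Return a command's output, or a short note if it cannot be run."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError as exc:  # for example, the command is not installed
        return f"(could not run {argv[0]}: {exc})\n"
    # Fall back to stderr so permission errors still show up in the report.
    return result.stdout or result.stderr


def read_file(path):
    """Return a file's contents, or a short note if it is unreadable."""
    try:
        return Path(path).read_text(errors="replace")
    except OSError as exc:
        return f"(could not read {path}: {exc})\n"


def collect_report():
    """Gather dmesg, lsmod and /proc/pci into one labelled string."""
    sections = [
        ("dmesg", run_command(["dmesg"])),
        ("lsmod", run_command(["lsmod"])),
        ("/proc/pci", read_file("/proc/pci")),
    ]
    return "\n".join(f"===== {title} =====\n{body}" for title, body in sections)


if __name__ == "__main__":
    print(collect_report())
```

Redirecting the script's output to a file gives a single attachment for the bug. The hardware summary (item 1) still has to be written by hand.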

Comment 2 Need Real Name 2003-04-15 17:00:55 UTC
Hardware: AMD K6-2 450, GeForce4 MX440, 256 MB RAM, ALi 1543 motherboard


PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: Acer Laboratories Inc. [ALi] M1541 (rev 4).
      Master Capable.  Latency=32.
      Non-prefetchable 32 bit memory at 0x0 [0x1ffffff].
  Bus  0, device   1, function  0:
    PCI bridge: Acer Laboratories Inc. [ALi] M1541 PCI to AGP Controller (rev 4).
      Master Capable.  Latency=32.  Min Gnt=14.
  Bus  0, device   7, function  0:
    ISA bridge: Acer Laboratories Inc. [ALi] M1533 PCI to ISA Bridge [Aladdin IV] (rev 195).
  Bus  0, device   8, function  0:
    Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 16).
      IRQ 10.
      Master Capable.  Latency=32.  Min Gnt=32.Max Lat=64.
      I/O at 0x2000 [0x20ff].
      Non-prefetchable 32 bit memory at 0xe4001000 [0xe40010ff].
  Bus  0, device  15, function  0:
    IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev 194).
      IRQ 10.
      Master Capable.  Latency=32.  Min Gnt=2.Max Lat=4.
      I/O at 0x2400 [0x240f].
  Bus  1, device   0, function  0:
    VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX440] (rev 163).
      IRQ 11.
      Master Capable.  Latency=32.  Min Gnt=5.Max Lat=1.
      Non-prefetchable 32 bit memory at 0xe2000000 [0xe2ffffff].
      Prefetchable 32 bit memory at 0xd0000000 [0xd7ffffff].
      Prefetchable 32 bit memory at 0xd8000000 [0xd807ffff].

lsmod output:
mousedev                5204   1 (autoclean)
input                   5696   0 (autoclean) [mousedev]
lp                      8356   0 (autoclean)
parport                34688   0 (autoclean) [lp]
autofs                 12404   0 (autoclean) (unused)
8139too                17032   1
mii                     3688   0 [8139too]
ipt_REJECT              3640   6 (autoclean)
iptable_filter          2316   1 (autoclean)
ip_tables              14392   2 [ipt_REJECT iptable_filter]
sg                     34636   0 (autoclean)
sr_mod                 16408   0 (autoclean)
ide-scsi               11088   0
scsi_mod              102680   3 [sg sr_mod ide-scsi]
ide-cd                 33500   0
cdrom                  30848   0 [sr_mod ide-cd]
ext3                   64256   5
jbd                    48436   5 [ext3]

dmesg output:
Linux version 2.4.20-9custom (root@localhost.localdomain) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #1 Tue Apr 15 03:24:02 BST 2003
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 0000000010000000 (usable)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
256MB LOWMEM available.
On node 0 totalpages: 65536
zone(0): 4096 pages.
zone(1): 61440 pages.
zone(2): 0 pages.
Kernel command line: ro root=LABEL=/ hda=ide-scsi
ide_setup: hda=ide-scsi
Initializing CPU#0
Detected 449.934 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 897.84 BogoMIPS
Memory: 254048k/262144k available (1190k kernel code, 5660k reserved, 308k data, 284k init, 0k highmem)
Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer-cache hash table entries: 16384 (order: 4, 65536 bytes)
Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
Enabling new style K6 write allocation for 256 Mb
CPU: L1 I Cache: 32K (32 bytes/line), D cache 32K (32 bytes/line)
CPU:     After generic, caps: 008021bf 808029bf 00000000 00000002
CPU:             Common caps: 008021bf 808029bf 00000000 00000002
CPU: AMD-K6(tm) 3D processor stepping 0c
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: AMD K6
PCI: PCI BIOS revision 2.10 entry at 0xfb500, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router ALI [10b9/1533] at 00:07.0
isapnp: Scanning for PnP cards...
isapnp: Card 'Crystal_4237B'
isapnp: Card 'U.S. Robotics 56K Voice INT'
isapnp: 2 Plug & Play cards detected total
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16)
Starting kswapd
VFS: Disk quotas vdquot_6.5.1
pty: 512 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ
ttyS1 at 0x02f8 (irq = 3) is a 16550A
ttyS0 at port 0x03f8 (irq = 4) is a 16550A
Real Time Clock Driver v1.10e
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
NET4: Frame Diverter 0.46
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00beta-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ALI15X3: IDE controller at PCI slot 00:0f.0
PCI: Assigned IRQ 10 for device 00:0f.0
ALI15X3: chipset revision 194
ALI15X3: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x2400-0x2407, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0x2408-0x240f, BIOS settings: hdc:DMA, hdd:pio
hda: ARTEC WRR-52X 1.13 20021212, ATAPI CD/DVD-ROM drive
hdb: FX320S, ATAPI CD/DVD-ROM drive
hdc: ST36421A, ATA DISK drive
blk: queue c030c8e0, I/O limit 4095Mb (mask 0xffffffff)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hdc: host protected area => 1
hdc: 12596850 sectors (6450 MB) w/256KiB Cache, CHS=13330/15/63, UDMA(33)
ide-floppy driver 0.99.newide
Partition check:
 hdc: [PTBL] [833/240/63] hdc1 hdc2 hdc3 hdc4 < hdc5 hdc6 hdc7 >
ide-floppy driver 0.99.newide
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 140k freed
VFS: Mounted root (ext2 filesystem).
Journalled Block Device driver loaded
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: ide1(22,5): orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 169294
ext3_orphan_cleanup: deleting unreferenced inode 169293
ext3_orphan_cleanup: deleting unreferenced inode 169292
EXT3-fs: ide1(22,5): 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Freeing unused kernel memory: 284k freed
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,5), internal journal
Adding Swap: 234320k swap-space (priority -1)
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,2), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,3), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,6), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
hdb: ATAPI 32X CD-ROM drive, 256kB Cache
Uniform CD-ROM driver Revision: 3.12
SCSI subsystem driver Revision: 1.00
scsi0 : SCSI host adapter emulation for IDE ATAPI devices
  Vendor: ARTEC     Model: WRR-52X           Rev: 1.13
  Type:   CD-ROM                             ANSI SCSI revision: 02
Attached scsi CD-ROM sr0 at scsi0, channel 0, id 0, lun 0
sr0: scsi3-mmc drive: 52x/52x writer cd/rw xa/form2 cdda tray
ip_tables: (C) 2000-2002 Netfilter core team
8139too Fast Ethernet driver 0.9.26
PCI: Found IRQ 10 for device 00:08.0
divert: allocating divert_blk for eth0
eth0: RealTek RTL8139 Fast Ethernet at 0xd0896000, 00:90:47:02:81:dd, IRQ 10
eth0:  Identified 8139 chip type 'RTL-8139C'
eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
lp: driver loaded but no devices found
mice: PS/2 mouse device common for all mice

Comment 3 Need Real Name 2003-04-18 15:18:15 UTC
Sample oops:

Printing eip:
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0060 [<d08be820>] Not Tainted
EIP is at (2.4.20-9custom)
eax: 00000000 ebx: cde37f20 ecx: 00000009 edx: 00000009
esi: 00000001 edi: c90c5fc4 ebp: 00000009 esp: c90c5f80
ds: 0068 es: 0068  ss: 0068
process gnome-volume-co (pid: 12388, stackpage=c90c5000)
stack: c010a392 00000009 00000000 c90c5fc4 c02c1a20 00000009 c90cfc4 cde37f20
       c010a4fa 00000009 c90c5fc4 cde37f20 406aebdc 00000000 0806640c 6ffff058
       c010cd58 406aebdc 00000001 00000001 00000000 0806640c 6ffff058 00000000
Call Trace: [<c010a392>] (0xc90c5f80)
[<c010a4fa>] (0x90c5fa0)
[<c010cd58>] (0x90c5fa0)
Code 55 bd ff 00 00 00 57 6f 00 00 00 00 56 53 8b 54 24 18 86 04
<0> Kernel Panic Aieee, killing Interrupt Handler
In Interrupt Handler - not syncing
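
Before an oops like the one above can be decoded, the code addresses have to be pulled out of the text. A small Python sketch of that step follows. It assumes the bracketed `[<address>]` form seen in this report, and the function name and regular expression are illustrative rather than taken from any existing tool.

```python
import re

# Matches the "[<c010a392>]" form used for code addresses in the oops text.
ADDRESS_RE = re.compile(r"\[<([0-9a-fA-F]{8})>\]")


def extract_addresses(oops_text):
    """Return the bracketed code addresses as integers, in order of appearance."""
    return [int(match, 16) for match in ADDRESS_RE.findall(oops_text)]


# The EIP and Call Trace lines from the sample oops in this comment.
SAMPLE = """\
EIP: 0060 [<d08be820>] Not Tainted
Call Trace: [<c010a392>] (0xc90c5f80)
[<c010a4fa>] (0x90c5fa0)
[<c010cd58>] (0x90c5fa0)
"""

if __name__ == "__main__":
    for address in extract_addresses(SAMPLE):
        print(f"{address:08x}")
```

The parenthesised values are deliberately ignored here, since only the bracketed values are code addresses that a symbol table could resolve.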

Comment 4 Arjan van de Ven 2003-04-18 15:36:15 UTC
Unfortunately, in your custom compile you disabled CONFIG_KALLSYMS (which is used
to automatically resolve those hex addresses to function names).
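
To illustrate what resolving an address to a function name involves, here is a rough Python sketch that looks addresses up in a System.map-style symbol table ("address type name" per line). It is not how CONFIG_KALLSYMS is implemented in the kernel. The helper names are hypothetical, and the symbols in the example table are invented for illustration only.

```python
import bisect


def parse_system_map(text):
    """Parse 'address type name' lines into (address, name) pairs sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the nearest symbol at or below address, else None."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below every known symbol
    base, name = symbols[index]
    return name, address - base


# Hypothetical symbol table: these names and addresses are invented for illustration.
EXAMPLE_MAP = """\
c0100000 T example_start
c0100400 T example_handler
c0100900 t example_helper
"""

if __name__ == "__main__":
    table = parse_system_map(EXAMPLE_MAP)
    print(resolve(table, 0xc0100412))
```

With the System.map that matches the running kernel, the same lookup could be applied to the call trace addresses from an oops. Two caveats apply to this sketch: an address above the last symbol is reported against that last symbol with a large offset, and code in loadable modules is not listed in System.map at all.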

Comment 5 Need Real Name 2003-07-28 16:00:51 UTC
Apologies - in the end this turned out to be a fan that worked when it felt
like it.

Since I got a new fan everything is fine again; the timing was just a coincidence.

