Red Hat Bugzilla – Bug 484672
Kernel 22.214.171.124-170.2.5.fc10.x86_64 frequent panics
Last modified: 2009-03-10 17:00:15 EDT
Description of problem:
Since installing kernel-126.96.36.199-170.2.5.fc10.x86_64, the system panics frequently and leaves nothing in the logs. The screen trace was not captured, but it pointed to a problem in the file system area of the kernel.
Version-Release number of selected component (if applicable):
kernel-188.8.131.52-170.2.5.fc10.x86_64
Steps to Reproduce:
Upgrade to kernel-188.8.131.52-170.2.5.fc10.x86_64, then use the system for a week.
Actual results:
The kernel panics and prints a lot of detail, but only to the console screen. The trace suggests that the trouble originates in the file system area of the kernel. Sorry, the console screen contents were not captured.
I am trying to back off to kernel-184.108.40.206-159.fc10.x86_64, which was stable. The machine is a server with 6 SATA drives in a RAID6 configuration and must be reliable.
If we don't have the text of the error messages, there's not much that can be done about the bug...
OK, two problems:
(1) How do you capture screen text? There is nothing in the logs.
(2) This machine must be reliable and has been downgraded to the earlier kernel which worked fine.
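One possible way to get console text off a machine that locks up without writing logs is the kernel's netconsole module, which sends kernel messages as UDP packets to another host. The following is only a minimal sketch of a receiver for that other host, assuming netconsole on the failing machine has been pointed at this host and at UDP port 6666 (check the kernel's netconsole documentation for the exact setup). The function names and the log file name are made up for illustration.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket on which kernel messages can be received.

    The port is an assumption; it must match whatever the sending
    machine's netconsole configuration targets.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, logfile, max_packets=None):
    """Append each received datagram to logfile, flushing after every write.

    Flushing each time means the last lines before a lockup are already
    saved. With max_packets=None this loops until interrupted.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _addr = sock.recvfrom(65535)
        logfile.write(data.decode("utf-8", errors="replace"))
        logfile.flush()
        count += 1
    return count


# Example use on the receiving host (hypothetical log file name):
#   with open("console-capture.log", "a") as log:
#       receive_messages(open_listener(), log)
```

A serial console or a photo of the screen would serve the same purpose; the point is only to get the backtrace text somewhere that survives the lockup.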
It happened again, but this time with kernel-220.127.116.11-159.fc10.x86_64.rpm (the previous version -- my next step is to try even older kernel-18.104.22.168-134.fc10.x86_64).
The machine locks up and the kernel prints a backtrace from _spin_lock about once per minute, so this isn't quite a kernel panic; it looks more like periodic detection of deadlocks related to the file system. There are two patterns of backtrace:
I copied this by hand from the console, so there may be typos. The other pattern is:
Both of the above look like some kind of deadlock related to the file system. There are no indications of problems in the system logs, and "smartctl -a ..." shows that none of the drives had any errors.
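For reference, a drive check like the one above can be scripted across all six drives. This is a rough sketch, not the exact command used here: it runs "smartctl -H" (the overall health summary) rather than "-a", and the device names /dev/sda through /dev/sdf are assumptions for the six drives. The helper names are hypothetical.

```python
import subprocess

# Assumed device names for the six drives; adjust to the real system.
DRIVES = ["/dev/sd" + letter for letter in "abcdef"]


def health_passed(smartctl_output):
    """Return True if the overall-health line of the output ends in PASSED."""
    for line in smartctl_output.splitlines():
        if "overall-health" in line:
            return line.strip().endswith("PASSED")
    return False  # no health line found: treat the drive as not verified


def check_drive(device):
    """Run `smartctl -H` on one device (usually needs root) and report the result."""
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    return health_passed(result.stdout)


# Example use:
#   for device in DRIVES:
#       print(device, "ok" if check_drive(device) else "CHECK THIS DRIVE")
```

A clean SMART report only says the drives believe they are healthy; it does not rule out other hardware.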
The machine has the following parts:
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82P965/G965 PCI Express Root Port (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 3 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation Device 2833 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation GeForce 7100 GS (rev a1)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
03:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
04:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
06:01.0 Ethernet controller: ADMtek NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)
06:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
06:04.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 14)
Each of the 6 ports of the Intel SATA controller has a 500 GB drive attached. The JMicron SATA controller has a DVD-ROM attached. The processor is an Intel Core Duo 6700 @ 2.66 GHz. The BIOS is AMI version 1226 rev. 8.12 released 11/23/2007.
Please close this bug: Memory on the system is going bad.
Memtest86+ originally confirmed that memory was good (about 2 years ago), but re-testing shows frequent errors of the leading bit. Re-seating DIMMs didn't help. New memory is on order.