Bug 2534 - kernel oopses in random processes related to disk activity sometimes resulting in data-loss
Status: CLOSED WORKSFORME
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 5.2
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: David Lawrence
Depends On:
Blocks:
Reported: 1999-05-04 04:54 EDT by Hans Otten
Modified: 2008-05-01 11:37 EDT (History)
Doc Type: Bug Fix
Last Closed: 1999-05-04 10:41:12 EDT
Attachments: None

Description Hans Otten 1999-05-04 04:54:43 EDT
Hi,

In the past I have had two kernel oopses and one file
system inconsistency on a RH Linux 5.2 system, full
install. Kernel package was kernel-2.0.36-0.7.i386.rpm.
Below is the information found in the syslog with respect
to one of the oopses:

Apr 29 13:58:52 tcn-server kernel: general protection: 0000
Apr 29 13:58:52 tcn-server kernel: CPU:    0
Apr 29 13:58:52 tcn-server kernel: EIP: 0010:[wake_up+44/240]
Apr 29 13:58:52 tcn-server kernel: EFLAGS: 00010296
Apr 29 13:58:52 tcn-server kernel: eax: 00000000   ebx: c10405c6   ecx: 00cf374c   edx: 010804c1
Apr 29 13:58:52 tcn-server kernel: esi: 000003ce   edi: 00cf3748   ebp: 015b1f6c   esp: 015b1f60
Apr 29 13:58:52 tcn-server kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 0018
Apr 29 13:58:52 tcn-server kernel: Process update (pid: 281, process nr: 21, stackpage=015b1000)
Apr 29 13:58:52 tcn-server kernel: Stack: 00cf3700 000003ce 00000000 bffffe6c 00124516 00cf374c 00cf3700 001247d7
Apr 29 13:58:52 tcn-server kernel:        00cf3700 01581810 00000000 00000000 00127858 00000000 00000000 01581810
Apr 29 13:58:52 tcn-server kernel:        00000000 00000000 00715c18 001279b5 01581810 00000001 0010abc5 00000001
Apr 29 13:58:52 tcn-server kernel: Call Trace: [write_inode+110/116] [sync_inodes+63/92] [sync_old_buffers+20/316] [sys_bdflush+53/152] [system_call+85/124]
Apr 29 13:58:52 tcn-server kernel: Code: 8b 13 8b 5b 04 85 d2 74 76 8b 02 83 f8 02 74 07 8b 02 83 f8
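
As a starting point for reading the oops above, the symbolic frames that klogd resolved can be pulled apart mechanically. The sketch below is only an illustration, not part of the original report: it assumes each frame has the form [symbol+offset/size] with decimal byte counts, and that the Call Trace lists the innermost caller first. The 2.0 kernels build this trace by scanning the stack for addresses that look like kernel text, so stale entries are possible and the reconstructed chain should be treated as a hint, not proof. The names parse_frames and call_chain are hypothetical helpers.

```python
import re

# Frames copied from the oops in this report.
EIP = "[wake_up+44/240]"
TRACE = (
    "[write_inode+110/116] [sync_inodes+63/92] "
    "[sync_old_buffers+20/316] [sys_bdflush+53/152] "
    "[system_call+85/124]"
)

# Assumed frame format: [symbol+offset/size], numbers in decimal bytes.
FRAME = re.compile(r"\[(\w+)\+(\d+)/(\d+)\]")


def parse_frames(text):
    """Return a list of (symbol, offset, size) tuples found in text."""
    return [(name, int(off), int(size)) for name, off, size in FRAME.findall(text)]


def call_chain(eip, trace):
    """List symbols from the outermost caller down to the faulting function."""
    callers = [name for name, _, _ in reversed(parse_frames(trace))]
    return callers + [parse_frames(eip)[0][0]]


chain = call_chain(EIP, TRACE)
print(" -> ".join(chain))
```

Read this way, the fault happened in wake_up while the update daemon's bdflush system call was writing inodes back to disk, which fits the report's link to disk activity.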

I had the following partitions:
Disk /dev/hda: 255 heads, 63 sectors, 524 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start      End   Blocks   Id  System
/dev/hda1   *         1      261  (Windows 98)
/dev/hda2           262      524  2112547+   5  Extended
/dev/hda5           262      392  1052226   83  Linux native
/dev/hda6           393      401    72261   82  Linux swap
/dev/hda7           402      524   987966   83  Linux native

/dev/hda5 was mounted on /
/dev/hda7 was mounted on /home
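
The Blocks column in the fdisk listing above can be cross-checked against the cylinder ranges, which is a quick way to confirm the partition table itself is self-consistent. The sketch below is illustrative only. It assumes that each logical partition starts one track (63 sectors) into its first cylinder, with that track holding the extended boot record, and that fdisk appends "+" when a partition has an odd number of 512-byte sectors. The helper name fdisk_blocks is hypothetical.

```python
# Geometry reported by fdisk for /dev/hda.
HEADS, SECTORS_PER_TRACK = 255, 63
SECTORS_PER_CYL = HEADS * SECTORS_PER_TRACK  # 16065 sectors of 512 bytes


def fdisk_blocks(start_cyl, end_cyl, reserved_sectors=0):
    """Return the size in 1K blocks as fdisk would print it.

    reserved_sectors is the space skipped at the start of the first
    cylinder (assumed to be one track for logical partitions).
    """
    sectors = (end_cyl - start_cyl + 1) * SECTORS_PER_CYL - reserved_sectors
    whole, odd = divmod(sectors, 2)  # two 512-byte sectors per 1K block
    return f"{whole}{'+' if odd else ''}"


print("hda2", fdisk_blocks(262, 524))                     # extended partition
print("hda5", fdisk_blocks(262, 392, SECTORS_PER_TRACK))  # logical
print("hda6", fdisk_blocks(393, 401, SECTORS_PER_TRACK))  # logical
print("hda7", fdisk_blocks(402, 524, SECTORS_PER_TRACK))  # logical
```

Under these assumptions the computed values match the listing (2112547+, 1052226, 72261, 987966), so the table shows no overlapping or mis-sized logical partitions.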

Startup information:
Apr 29 10:04:11 tcn-server syslogd 1.3-3: restart.
Apr 29 10:04:11 tcn-server kernel: klogd 1.3-3, log source = /proc/kmsg started.
Apr 29 10:04:11 tcn-server kernel: Loaded 4253 symbols from /boot/System.map.
Apr 29 10:04:11 tcn-server kernel: Symbols match kernel version 2.0.36.
Apr 29 10:04:11 tcn-server kernel: Loaded 12 symbols from 6 modules.
Apr 29 10:04:11 tcn-server kernel: Memory: sized by int13 088h
Apr 29 10:04:11 tcn-server kernel: Console: 16 point font, 400 scans
Apr 29 10:04:11 tcn-server kernel: Console: colour VGA+ 80x25, 1 virtual console (max 63)
Apr 29 10:04:11 tcn-server kernel: pcibios_init : BIOS32 Service Directory structure at 0x000ffe80
Apr 29 10:04:11 tcn-server kernel: pcibios_init : BIOS32 Service Directory entry at 0xffe90
Apr 29 10:04:11 tcn-server kernel: pcibios_init : PCI BIOS revision 2.10 entry at 0xfca8e
Apr 29 10:04:11 tcn-server kernel: Probing PCI hardware.
Apr 29 10:04:11 tcn-server kernel: Calibrating delay loop.. ok - 398.13 BogoMIPS
Apr 29 10:04:11 tcn-server kernel: Memory: 30824k/32768k available (748k kernel code, 384k reserved, 812k data)
Apr 29 10:04:11 tcn-server kernel: Swansea University Computer Society NET3.035 for Linux 2.0
Apr 29 10:04:11 tcn-server kernel: NET3: Unix domain sockets 0.13 for Linux NET3.035.
Apr 29 10:04:11 tcn-server kernel: Swansea University Computer Society TCP/IP for NET3.034
Apr 29 10:04:11 tcn-server kernel: IP Protocols: IGMP, ICMP, UDP, TCP
Apr 29 10:04:11 tcn-server kernel: Linux IP multicast router 0.07.
Apr 29 10:04:11 tcn-server kernel: VFS: Diskquotas version dquot_5.6.0 initialized
Apr 29 10:04:11 tcn-server kernel:
Apr 29 10:04:11 tcn-server kernel: Checking 386/387 coupling... Ok, fpu using exception 16 error reporting.
Apr 29 10:04:11 tcn-server kernel: Checking 'hlt' instruction... Ok.
Apr 29 10:04:11 tcn-server kernel: Linux version 2.0.36 (root@porky.redhat.com) (gcc version 2.7.2.3) #1 Tue Oct 13 22:17:11 EDT 1998
Apr 29 10:04:11 tcn-server kernel: Starting kswapd v 1.4.2.2
Apr 29 10:04:11 tcn-server kernel: Serial driver version 4.13 with no serial options enabled
Apr 29 10:04:11 tcn-server kernel: tty00 at 0x03f8 (irq = 4) is a 16550A
Apr 29 10:04:11 tcn-server kernel: tty01 at 0x02f8 (irq = 3) is a 16550A
Apr 29 10:04:11 tcn-server kernel: PS/2 auxiliary pointing device detected -- driver installed.
Apr 29 10:04:11 tcn-server kernel: Real Time Clock Driver v1.09
Apr 29 10:04:11 tcn-server kernel: Ramdisk driver initialized : 16 ramdisks of 4096K size
Apr 29 10:04:11 tcn-server kernel: ide: i82371 PIIX (Triton) on PCI bus 0 function 57
Apr 29 10:04:11 tcn-server kernel:     ide0: BM-DMA at 0xffa0-0xffa7
Apr 29 10:04:11 tcn-server kernel:     ide1: BM-DMA at 0xffa8-0xffaf
Apr 29 10:04:11 tcn-server kernel: hda: WDC AC14300R, 4112MB w/512kB Cache, CHS=524/255/63, UDMA
Apr 29 10:04:11 tcn-server kernel: hdc: TOSHIBA CD-ROM XM-6402B, ATAPI CDROM drive
Apr 29 10:04:11 tcn-server kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Apr 29 10:04:11 tcn-server kernel: ide1 at 0x170-0x177,0x376 on irq 15
Apr 29 10:04:11 tcn-server kernel: Floppy drive(s): fd0 is 1.44M
Apr 29 10:04:11 tcn-server kernel: FDC 0 is a National Semiconductor PC87306
Apr 29 10:04:11 tcn-server kernel: md driver 0.36.3 MAX_MD_DEV=4, MAX_REAL=8
Apr 29 10:04:11 tcn-server kernel: scsi : 0 hosts.
Apr 29 10:04:11 tcn-server kernel: scsi : detected total.
Apr 29 10:04:11 tcn-server kernel: Partition check:
Apr 29 10:04:11 tcn-server kernel:  hda: hda1 hda2 < hda5 hda6 hda7 >
Apr 29 10:04:11 tcn-server kernel: VFS: Mounted root (ext2 filesystem) readonly.
Apr 29 10:04:11 tcn-server kernel: Adding Swap: 72256k swap-space (priority -1)
Apr 29 10:04:11 tcn-server kernel: sysctl: ip forwarding off
Apr 29 10:04:11 tcn-server kernel: Swansea University Computer Society IPX 0.34 for NET3.035
Apr 29 10:04:11 tcn-server kernel: IPX Portions Copyright (c) 1995 Caldera, Inc.
Apr 29 10:04:11 tcn-server kernel: Appletalk 0.17 for Linux NET3.035
Apr 29 10:04:11 tcn-server kernel: 3c59x.c:v0.99E 5/12/98 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html
Apr 29 10:04:11 tcn-server kernel: eth0: 3Com 3c905B Cyclone 100baseTx at 0xdc00, 00:c0:4f:4b:3c:a3, IRQ 11
Apr 29 10:04:11 tcn-server kernel:   8K byte-wide RAM 5:3 Rx:Tx split, autoselect/NWay Autonegotiation interface.
Apr 29 10:04:11 tcn-server kernel:   Enabling bus-master transmits and whole-frame receives.

I have since removed the Windows partition and reinstalled
Linux in primary partitions. I have also upgraded to kernel
package kernel-2.0.36-3.i386.rpm. As a test I compiled a
kernel, as this involves quite a lot of disk activity, and
all went well.

I see two possibilities:
o running Linux from extended partitions might be a problem
o between kernel packages 2.0.36-0.7 and 2.0.36-3,
improvements were made to the filesystem and/or IDE
drivers that fixed my problem.

I still have the original installation that generated the
problems at hand, if anyone is interested.

My main interest is in the interpretation of the kernel
oops.
Comment 1 Hans Otten 1999-05-04 10:41:59 EDT
19990504:
  File system corruption occurs even with kernel-2.0.36-3.i386 and primary
partitions. I now suspect some unfortunate interaction
between cdp (a curses-based audio CD player) and UDMA. The cdp package is
cdp-0.33-10.i386.
Comment 2 Alan Cox 2000-08-08 09:25:27 EDT
I'm assuming this unrelated bug was hardware. If 6.x is doing the same, run
memtest86, and if that says the box is OK, file or reopen a bug.

And no, I have no idea how this bug stayed open so long.
