Bug 97833 - (IDE PDC202XX_NEW)Data corruption with Promise FastTrak and latest 2.4.20-18.7smp kernel
Status: CLOSED WONTFIX
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686 Linux
Priority: medium Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2003-06-22 19:17 EDT by Matthias Saou
Modified: 2007-04-18 12:54 EDT (History)
Last Closed: 2004-09-30 11:41:11 EDT
Description Matthias Saou 2003-06-22 19:17:13 EDT
After upgrading to the latest errata kernel, 2.4.20-18.7smp, some Intel 1U
servers with an onboard Promise FastTrak IDE RAID controller started
experiencing sudden data corruption on some partitions, a few days after the
upgrade. None of the servers still running 2.4.18-24.7.xsmp or 2.4.18-27.7.xsmp
showed problems, and all servers had been running for 4-5 months with no similar
problems prior to that.

Here are the most common log entries encountered:

Jun 20 23:25:07 erpweb05 kernel: EXT3-fs error (device ataraid(114,7)):
ext3_new_block: Allocating block in system zone - block = 6586369
Jun 20 23:25:07 erpweb05 kernel: EXT3-fs error (device ataraid(114,7)):
ext3_new_block: Allocating block in system zone - block = 6586376
Jun 20 23:25:08 erpweb05 kernel: EXT3-fs error (device ataraid(114,7)):
ext3_new_block: Allocating block in system zone - block = 6586377
Jun 20 23:25:08 erpweb05 kernel: EXT3-fs error (device ataraid(114,7)):
ext3_new_block: Allocating block in system zone - block = 6586378
Jun 20 23:25:08 erpweb05 kernel: EXT3-fs error (device ataraid(114,7)):
ext3_new_block: Allocating block in system zone - block = 6586379
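
For anyone checking other servers for the same symptom, a minimal Python sketch along these lines could pull these entries out of a syslog file and group the affected block numbers by device. It assumes each entry sits on a single line in the actual log (the entries above are wrapped for the report), and the function name summarize_system_zone_errors is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the "Allocating block in system zone" entries quoted above.
# Assumption: each entry is on one line in the real log file.
PATTERN = re.compile(
    r"kernel: EXT3-fs error \(device (?P<device>[^()]+\(\d+,\d+\))\): "
    r"ext3_new_block: Allocating block in system zone - block = (?P<block>\d+)"
)


def summarize_system_zone_errors(lines):
    """Return {device: sorted list of distinct block numbers} for matching lines."""
    blocks = defaultdict(set)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            blocks[match.group("device")].add(int(match.group("block")))
    return {device: sorted(found) for device, found in blocks.items()}
```

It can be fed an open log file directly, for example summarize_system_zone_errors(open("/var/log/messages")), and lines that do not match are simply ignored.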

The symptoms were the same on all 3 servers that suffered corruption: one
partition suddenly started having severe problems. When it was /var, not much
happened, but when it was /usr, many processes started dying. Running fsck then
trashed most of the filesystem, and many directories ended up as #<some
number> in lost+found.

Here is the detailed hardware and kernel module information:

00:02.0 RAID bus controller: Promise Technology, Inc. 20267 (rev 02)
        Subsystem: Intel Corp.: Unknown device 3410
        Flags: bus master, medium devsel, latency 64, IRQ 19
        I/O ports at 1400 [size=8]
        I/O ports at 1408 [size=4]
        I/O ports at 1410 [size=8]
        I/O ports at 140c [size=4]
        I/O ports at 1440 [size=64]
        Memory at fe7a0000 (32-bit, non-prefetchable) [size=128K]
        Expansion ROM at fe7e0000 [disabled] [size=64K]
        Capabilities: [58] Power Management version 1

# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/ataraid/d0p1       202220    135856     55924  71% /
none                    515264         0    515264   0% /dev/shm
/dev/ataraid/d0p7       497829      8239    463888   2% /tmp
/dev/ataraid/d0p5      3067568    345672   2566068  12% /usr
/dev/ataraid/d0p8     72975932  19856400  49412536  29% /var

# lsmod
Module                  Size  Used by    Not tainted
nfs                    86048   1  (autoclean)
lockd                  55904   1  (autoclean) [nfs]
sunrpc                 79252   1  (autoclean) [nfs lockd]
eepro100               20720   1
iptable_filter          2464   1  (autoclean)
ip_tables              14464   1  [iptable_filter]
ide-cd                 30208   0  (autoclean)
cdrom                  32384   0  (autoclean) [ide-cd]
usb-ohci               20896   0  (unused)
usbcore                74400   1  [usb-ohci]
pdcraid                14144   5
ataraid                 8736   5  [pdcraid]
ext3                   67392   3
jbd                    51528   3  [ext3]

All servers are now downgraded to 2.4.18-27.7.xsmp.

Matthias
Comment 1 Bugzilla owner 2004-09-30 11:41:11 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
