Bug 100669 - Got ext3 errors after building up a fresh large filesystem
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 9
Hardware: i686 Linux
Priority: high
Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2003-07-24 02:10 EDT by Klaus Steinberger
Modified: 2005-10-31 17:00 EST
Doc Type: Bug Fix
Last Closed: 2003-08-04 08:37:47 EDT


Attachments:
Output from e2fsck after that, the filesystem is severely corrupted (7.07 KB, text/plain)
2003-07-24 02:13 EDT, Klaus Steinberger

Description Klaus Steinberger 2003-07-24 02:10:15 EDT
Description of problem:

We built a server with a large RAID array and tried to transfer large
filesystems (~100 GByte) to it. We tried rsync, NFS, and restoring from a
Tivoli backup server. After populating the filesystem, we reproducibly got
the following ext3 errors, starting at 04:06:


Jul 24 04:06:07 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #2768897: rec_len %% 4 != 0 - offset=900, inode=17956892,
rec_len=30583, name_len=49
Jul 24 04:07:00 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #19316737: rec_len %% 4 != 0 - offset=2448,
inode=17760280, rec_len=17489, name_len=114
Jul 24 04:08:10 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #7110658: rec_len %% 4 != 0 - offset=820,
inode=1835166060, rec_len=26478, name_len=95
Jul 24 04:08:11 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #6815746: rec_len %% 4 != 0 - offset=88,
inode=1886545774, rec_len=26670, name_len=0
Jul 24 04:08:11 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #6864898: directory entry across blocks - offset=2240,
inode=1668572005, rec_len=25964, name_len=97
Jul 24 04:09:20 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #9568259: rec_len is too small for name_len - offset=24,
inode=9568309, rec_len=36, name_len=59
Jul 24 04:10:08 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #6406148: rec_len %% 4 != 0 - offset=264,
inode=1248159828, rec_len=29797, name_len=95
Jul 24 04:10:11 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #16154628: rec_len is too small for name_len -
offset=2760, inode=16154711, rec_len=36, name_len=59
Jul 24 04:10:50 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #18825222: rec_len is too small for name_len - offset=64,
inode=18825232, rec_len=32, name_len=55
Jul 24 04:10:51 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #18939910: rec_len %% 4 != 0 - offset=112,
inode=1819244133, rec_len=29813, name_len=105
Jul 24 04:10:55 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
directory #2342919 contains a hole at offset 4096
Jul 24 04:11:16 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #12304392: rec_len is too small for name_len -
offset=368, inode=12304434, rec_len=36, name_len=58
Jul 24 04:11:26 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #2195465: rec_len %% 4 != 0 - offset=192, inode=29811,
rec_len=32783, name_len=33
Jul 24 04:11:30 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #11468809: rec_len %% 4 != 0 - offset=1304,
inode=1634102127, rec_len=29811, name_len=95
Jul 24 04:11:39 etprd01 kernel: EXT3-fs error (device lvm(58,0)): ext3_readdir:
bad entry in directory #24969225: inode out of bounds - offset=1532,
inode=1836345390, rec_len=108, name_len=0
Jul 24 04:12:08 etprd01 kernel: EXT3-fs error (device lvm(58,1)): ext3_readdir:
bad entry in directory #2228225: rec_len is too small for name_len - offset=532,
inode=2228243, rec_len=40, name_len=63
Jul 24 04:12:09 etprd01 kernel: EXT3-fs error (device lvm(58,1)): ext3_readdir:
bad entry in directory #3620865: rec_len is too small for name_len - offset=120,
inode=3620870, rec_len=36, name_len=59
Jul 24 04:12:14 etprd01 kernel: EXT3-fs error (device lvm(58,1)): ext3_readdir:
bad entry in directory #7110657: rec_len is too small for name_len - offset=152,
inode=7110661, rec_len=32, name_len=53
Jul 24 04:12:22 etprd01 kernel: EXT3-fs error (device lvm(58,1)): ext3_readdir:
bad entry in directory #9977857: rec_len %% 4 != 0 - offset=2268,
inode=1918858100, rec_len=28277, name_len=95
[
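
Each message in the log above names a specific sanity check that a directory entry failed. As a rough illustration, those checks can be sketched as below. This is modelled on the message texts in the log, not copied from the kernel source, and the order of the checks is an assumption. The 4096-byte block size and the `inodes_count` default are also assumptions chosen for illustration (neither is stated in this report), and the function names are hypothetical.

```python
def dir_rec_len(name_len):
    """Smallest valid record length for a name of this length:
    an 8-byte fixed header plus the name, rounded up to a multiple of 4."""
    return (name_len + 8 + 3) & ~3


def check_dir_entry(offset, inode, rec_len, name_len,
                    block_size=4096, inodes_count=26_214_400):
    """Return the reason a directory entry looks invalid, or None if it passes.

    block_size and inodes_count are illustrative assumptions, not values
    taken from the affected filesystem.
    """
    if rec_len < dir_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < dir_rec_len(name_len):
        return "rec_len is too small for name_len"
    # The entry must end inside the block in which it starts.
    if (offset % block_size) + rec_len > block_size:
        return "directory entry across blocks"
    if inode > inodes_count:
        return "inode out of bounds"
    return None


# Values taken from the first log entry above.
print(check_dir_entry(offset=900, inode=17956892, rec_len=30583, name_len=49))
```

Feeding in the values from the other log entries gives the matching reasons under these assumptions, which suggests the directory blocks contain arbitrary data rather than slightly damaged entries.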

Version-Release number of selected component (if applicable):
2.4.20-18.9


How reproducible:
Every time we build this filesystem from scratch, including recreating the
LVM volume group.


Steps to Reproduce:
1. Create the volume group:
   pvcreate /dev/sdc1
   vgcreate -s 16M vg01 /dev/sdc1
2. Create the Logical Volumes:
   lvcreate -n etp -L 200G vg01
   lvcreate -n etp1 -L 200G vg01
3. Create the filesystems:
   mke2fs -j -L etp -R stride=16 /dev/vg01/etp
   tune2fs -c 0 -i 0 /dev/vg01/etp
   mke2fs -j -L etp1 -R stride=16 /dev/vg01/etp1
   tune2fs -c 0 -i 0 /dev/vg01/etp1
4. Mount them:
   mount /dev/vg01/etp /export/data/etp
   mount /dev/vg01/etp1 /export/data/etp1
5. Fill them with data (around 100 GBytes per FS)
   We tried rsync -e rsh from another server
   We also tried rsync onto an NFS-mounted filesystem from another server
   We also tried Tivoli's dsmc command to restore a filesystem
6. Wait until 04:06, when slocate or some other scheduled job runs, and you
   will see the errors in the log.
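
Step 6 depends on spotting the ext3 errors in the system log. A minimal sketch for tallying them by device and failure reason follows. It assumes each syslog entry sits on a single line (the entries quoted in this report are wrapped for display), and the regular expressions and function name are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches e.g. "EXT3-fs error (device lvm(58,0)): ext3_readdir: <message>"
LINE_RE = re.compile(
    r"EXT3-fs error \(device (?P<dev>.+?)\): (?P<func>\w+): (?P<msg>.*)"
)
# Extracts the reason from "bad entry in directory #N: <reason> - offset=..."
BAD_ENTRY_RE = re.compile(
    r"bad entry in directory #\d+: (?P<reason>.+?) - offset="
)


def tally_ext3_errors(lines):
    """Count EXT3-fs errors per (device, reason) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an ext3 error line
        bad = BAD_ENTRY_RE.match(m.group("msg"))
        if bad:
            reason = bad.group("reason")
        else:
            # Other messages: mask the directory number so they group together.
            reason = re.sub(r"#\d+", "#N", m.group("msg"))
        counts[(m.group("dev"), reason)] += 1
    return counts


sample = [
    "Jul 24 04:09:20 etprd01 kernel: EXT3-fs error (device lvm(58,0)): "
    "ext3_readdir: bad entry in directory #9568259: rec_len is too small "
    "for name_len - offset=24, inode=9568309, rec_len=36, name_len=59",
    "Jul 24 04:10:55 etprd01 kernel: EXT3-fs error (device lvm(58,0)): "
    "ext3_readdir: directory #2342919 contains a hole at offset 4096",
]
for key, n in tally_ext3_errors(sample).items():
    print(key, n)
```

To scan a real log, pass an open file object (for example the system's messages file) instead of `sample`.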
    
Actual results:

Data Corruption!


Expected results:

The system should never corrupt data!


Additional info:

Dual PIII 1.4 GHz system with a Tyan 2518 motherboard
3ware 7500-8 RAID controller with 6 disks of 160 GByte each
(5 disks in RAID 5, one disk as hot spare).

We see no errors from the RAID controller, no disk read errors, no SCSI errors,
just the ext3 errors. Some searching on the net led me to suspect that this is a
problem in the 2.4.20 kernel (maybe already present in 2.4.18) and/or in the
backports from 2.5 that are included in the 2.4.20-18 kernel.
Comment 1 Klaus Steinberger 2003-07-24 02:13:23 EDT
Created attachment 93096
Output from e2fsck after that, the filesystem is severely corrupted
Comment 2 Stephen Tweedie 2003-08-04 07:29:44 EDT
Can you reproduce this without using LVM?
Comment 3 Klaus Steinberger 2003-08-04 08:37:47 EDT
Yes, the error also happens without LVM.

We are currently investigating a possible hardware problem with the 3ware
controller. We have already replaced everything in this computer except the
3ware controller. We also tried connecting two of the 160 GB IDE disks directly
to the IDE ports on the motherboard. We then created a volume group spanning
these two disks and created a 200 GB logical volume with an ext3 filesystem.
This works without error.

We have now swapped the 3ware 7500 for an older 6000 and are trying again,
so please hold off on further action until we report back on this.

Sincerely,
Klaus Steinberger
Comment 4 Klaus Steinberger 2003-08-05 01:53:49 EDT
We have now replaced the 3ware 7500-8 controller with an old 3ware 6000
controller until we get a replacement for the 7500, and the problem has
disappeared. So I think it really was a faulty controller. Please excuse me
for reporting a bug, but it looked to me like a software problem, as we got no
error messages from the controller.

Sincerely,
Klaus Steinberger
Comment 5 Stephen Tweedie 2003-08-05 03:59:42 EDT
OK, thanks for following up on this.
