350301 – ext3 filesystem corruption

Summary: ext3 filesystem corruption

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	i686
OS:	Linux
Priority:	low
Severity:	urgent
Target Milestone:	---
Assignee:	Josef Bacik
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-10-24 11:41 UTC by Sergey Pachkov
Modified:	2008-02-26 21:26 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-02-26 21:26:31 UTC
Target Upstream Version:
Embargoed:

Description Sergey Pachkov 2007-10-24 11:41:00 UTC

Description of problem:
We run MySQL with large files (up to 130 GB) and many small text files (about
10,000 files, up to 10 KB each). Under high load the kernel reports these problems:
"
Oct  1 03:47:48 XXXX kernel: 08:11: rw=0, want=1357438156, limit=1144856128
Oct  1 03:47:48 XXXX kernel: attempt to access beyond end of device
Oct  1 03:47:48 XXXX kernel: 08:11: rw=0, want=1556136140, limit=1144856128
Oct  1 03:47:48 XXXX kernel: attempt to access beyond end of device
Oct  1 03:47:48 XXXX kernel: 08:11: rw=0, want=1422444748, limit=1144856128
Oct  1 03:47:48 XXXX kernel: attempt to access beyond end of device
Oct  1 03:47:48 XXXX kernel: 08:11: rw=0, want=1353769156, limit=1144856128
Oct  1 03:47:48 XXXX kernel: attempt to access beyond end of device
"
We did not run any applications like 'dd' that could explain the messages above,
so it is very strange to see them. Later we found data corruption in our files.
The next time, the kernel printed these lines:
"
Oct 19 04:03:04 hostdb kernel: EXT3-fs error (device sd(8,5)): ext3_readdir: bad
entry in directory #48110: rec_len is smaller than minimal - offset=0, inode=0,
rec_len=0, name_len=0
Oct 19 10:20:50 hostdb kernel: EXT3-fs error (device sd(8,5)): ext3_readdir: bad
entry in directory #48110: rec_len is smaller than minimal - offset=0, inode=0,
rec_len=0, name_len=0
Oct 19 11:18:03 hostdb kernel: GDT: Unknown SCSI command 0x4d to cache service !
Oct 19 11:18:03 hostdb last message repeated 4 times
Oct 19 11:18:33 hostdb kernel: GDT: Unknown SCSI command 0x4d to cache service !
Oct 19 11:18:33 hostdb last message repeated 4 times
Oct 19 11:23:55 hostdb kernel: GDT: Unknown SCSI command 0x4d to cache service !
Oct 19 11:23:55 hostdb last message repeated 4 times
Oct 19 11:32:16 hostdb kernel: EXT3-fs error (device sd(8,5)): ext3_readdir: bad
entry in directory #48110: rec_len is smaller than minimal - offset=0, inode=0,
rec_len=0, name_len=0
"
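
For reference, the "rec_len is smaller than minimal" message can be illustrated with a minimal sketch of a directory entry sanity check. It assumes the standard ext2/ext3 on-disk entry layout (32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file_type, little-endian) and a minimal record length of 12 bytes. The function names are hypothetical, and the check order is an approximation of the kernel's check, not a copy of it. Under those assumptions, an entry with inode=0, rec_len=0, name_len=0 is what a zero-filled directory block would look like.

```python
import struct
from typing import Optional


def ext3_dir_rec_len(name_len: int) -> int:
    """Smallest record that holds the 8-byte header plus the name, rounded up to 4 bytes."""
    return (name_len + 8 + 3) & ~3


def check_dir_entry(block: bytes, offset: int = 0) -> Optional[str]:
    """Return an error string for the entry at `offset`, or None if it passes these checks.

    Hypothetical helper: the checks approximate the kernel's directory entry
    validation and are not taken from the kernel source verbatim.
    """
    inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", block, offset)
    if rec_len < ext3_dir_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < ext3_dir_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset + rec_len > len(block):
        return "directory entry across blocks"
    return None


# A directory block that contains only zeros fails the very first check.
zero_block = bytes(4096)
print(check_dir_entry(zero_block))
```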

We don't know how to reproduce this problem, but it has happened twice in the last year.
Usually we cannot recover the storage and simply create a fresh ext3 filesystem.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux 3 (Update 3)
kernel-smp-2.4.21-20.EL.i686
qlogic 7.07.06

How reproducible:
n/a

Steps to Reproduce:
1. n/a
2.
3.
  
Actual results:
data corruption
filesystem corruption

Expected results:
Normal system operation without filesystem corruption or data loss.

Additional info:

Our system is based on 4 Intel Xeon 2.4 GHz CPUs with 2 GB of RAM.
For data storage we use a RAID attached through a QLogic 2x00 Fibre Channel
controller, and the system itself is installed on a RAID:
GDT: Storage RAID Controller Driver. Version: 2.05
GDT: Found 1 PCI Storage RAID Controllers
GDT CTR0: Configuring GDT-PCI HA at 4/8 IRQ 48
scsi0 : SRCZCR
  Vendor: Intel     Model: Host Drive  #00   Rev:
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: ESG-TSD   Model: SCA HSBP M23      Rev: 1.05
  Type:   Processor                          ANSI SCSI revision: 02

Comment 1 Sergey Pachkov 2007-10-31 08:23:01 UTC

additional info:
major minor  #blocks  name     rio rmerge rsect ruse wio wmerge wsect wuse running use aveq
   8     0  143331930 sda 1872107 19531040 171139390 14335430 4663839 35925827 324919600 3514019 0 20869670 17958709
   8     1     152586 sda1 538 9401 19878 1950 77 67 288 4650 0 4430 6600
   8     2    5116702 sda2 1163371 17126186 146315778 9829230 896466 22965737 191014184 28593164 0 6263820 38430804
   8     3    4610655 sda3 128015 294559 3380162 720500 286994 232944 4159568 5437350 0 2162110 6157730
   8     4          1 sda4 3 0 6 10 0 0 0 0 0 10 10
   8     5    4096543 sda5 17919 59512 619066 205500 81609 97295 1431240 2240840 0 2020740 2446320
   8     6    3582463 sda6 41009 164807 1645850 1066920 191747 667659 6921520 7832120 0 2570280 8900060
   8     7    3582463 sda7 14912 63858 629482 219230 686481 1076459 14107176 24504980 0 5006930 24724140
   8     8    2096451 sda8 287339 125 2299424 1144290 283704 291429 4614664 42280490 0 9861480 568347
   8     9    2048256 sda9 36446 36630 584642 186280 446021 1683332 17036272 8172990 0 5074990 8359200
   8    10    1534176 sda10 65680 121989 1500882 334020 404936 991110 11170984 6968230 0 5252400 7302190
   8    11    4891761 sda11 5269 37066 338002 35130 655381 511099 9335608 39476730 0 2063580 39511880
   8    12    9775521 sda12 110836 1614095 13798770 589460 730423 7408696 65128096 9801507 0 6681620 10393727
   8    16 1144860672 sdb 7436866 77894686 642758068 25197464 13030831 79000796 628373272 36912771 0 24654780 19191062
   8    17 1144856128 sdb1 7436852 77894652 642757972 25198124 13030831 79000796 628373272 36913051 0 24654810 19195572
1144856128

Our server uses fixed devices, and we did not change any devices on the running system.
I don't know why our external RAID moved from 8:17 to 8:11.
Maybe something revalidated the partitions?
But how?
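
One possible reading of the device number, sketched below as an assumption rather than a confirmed diagnosis: if the "08:11" in the kernel message is a hexadecimal major:minor pair (2.4 kernels generally format device names that way), it corresponds to decimal 8:17, which is sdb1 in the table above, and the "limit" value in the log equals the #blocks of sdb1. The helper name and the two-row lookup table are hypothetical and only illustrate the conversion.

```python
from typing import Tuple

# Two rows copied from the partition table in this comment:
# (major, minor) in decimal -> (name, #blocks)
PARTITIONS = {
    (8, 11): ("sda11", 4891761),
    (8, 17): ("sdb1", 1144856128),
}


def parse_kdev(name: str) -> Tuple[int, int]:
    """Parse 'MM:mm' as a hexadecimal major:minor pair (assumption: the log prints hex)."""
    major_hex, minor_hex = name.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


device = parse_kdev("08:11")
name, blocks = PARTITIONS[device]
print(device, name, blocks)
```

Under that assumption the device did not move: the messages refer to sdb1 all along.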

Comment 2 Eric Sandeen 2007-11-30 05:37:26 UTC

Did you fsck the filesystem?  What did it find?

Comment 3 Sergey Pachkov 2007-11-30 07:16:20 UTC

e2fsck -fn /dev/sdb1
e2fsck 1.32 (09-Nov-2002)
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Inode 32774, i_size is 136314880, should be 144703488.  Fix? no
Inode 32774, i_blocks is 266512, should be 282912.  Fix? no
Inode 37218 has illegal block(s).  Clear? no

I can no longer reproduce the fsck output because we created a fresh filesystem
(on 22 October).

tune2fs  -l /dev/sdb1
tune2fs 1.32 (09-Nov-2002)
Filesystem volume name:   MAIN_ARCHIVE
Last mounted on:          <not available>
Filesystem UUID:          36e50578-61fa-43c3-a4ff-fd77497a90e9
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal filetype needs_recovery sparse_super large_file
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              143114240
Block count:              286214032
Reserved block count:     14310701
Free blocks:              264838496
Free inodes:              143035334
First block:              0
Block size:               4096
Fragment size:            4096
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16384
Inode blocks per group:   512
Filesystem created:       Mon Oct 22 05:12:03 2007
Last mount time:          Mon Oct 22 08:21:04 2007
Last write time:          Mon Oct 22 08:21:04 2007
Mount count:              2
Maximum mount count:      20
Last checked:             Mon Oct 22 05:12:03 2007
Check interval:           864000 (1 week, 3 days)
Next check after:         Thu Nov  1 04:12:03 2007
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal UUID:             <none>
Journal inode:            8
Journal device:           0x0000
First orphan inode:       0

Comment 4 Josef Bacik 2008-02-04 19:58:52 UTC

Usually these messages mean the fs thinks it's bigger than the underlying disk 
and is trying to write to a spot outside of the disk.  Have you figured out a way to reproduce 
this on a regular basis?  Have you reproduced it on the U9 kernel?
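
A rough sketch of that size comparison, using only numbers already posted in this bug. It assumes that "want" and "limit" in the kernel messages are in 1 KiB units (the limit equals the #blocks value reported for sdb1). Note that the tune2fs output in comment 3 is from the filesystem recreated on 22 October, so it says nothing about the superblock of the filesystem that produced the errors. The variable and function names are hypothetical.

```python
# Values copied from the tune2fs output (comment 3) and the kernel log (description).
BLOCK_COUNT = 286214032      # "Block count"
BLOCK_SIZE = 4096            # "Block size", in bytes
PARTITION_KIB = 1144856128   # "limit" in the log, same as #blocks of sdb1
WANTED_KIB = [1357438156, 1556136140, 1422444748, 1353769156]  # "want" values


def fs_size_kib(block_count: int, block_size: int) -> int:
    """Filesystem size in KiB as described by the superblock fields."""
    return block_count * block_size // 1024


def beyond_end(want_kib: int, limit_kib: int) -> bool:
    """True if a request ends past the end of the device."""
    return want_kib > limit_kib


fs_kib = fs_size_kib(BLOCK_COUNT, BLOCK_SIZE)
print("filesystem fits partition:", fs_kib <= PARTITION_KIB)
for want in WANTED_KIB:
    print(want, "beyond end:", beyond_end(want, PARTITION_KIB))
```

With these numbers the recreated filesystem is exactly as large as the partition, while every logged request lies well past the end of it.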

Comment 5 Sergey Pachkov 2008-02-05 06:59:01 UTC

No. Since we switched to the U9 kernel we have not seen any problems with ext3.

Comment 6 Josef Bacik 2008-02-26 21:26:31 UTC

OK, I'm going to close this out.  Feel free to re-open it if you run into any more issues.
