Bug 27614 - File system errors
Status: CLOSED RAWHIDE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: Stephen Tweedie
QA Contact: Aaron Brown
Reported: 2001-02-14 09:18 EST by Michael Young
Modified: 2007-04-18 12:31 EDT
Last Closed: 2001-04-06 08:14:49 EDT
Description Michael Young 2001-02-14 09:18:46 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (X11; U; SunOS 5.6 sun4u)

This may be an issue with the ext2 filesystem itself rather than e2fsprogs,
but this is the nearest appropriate component I could find.
I have been having files go missing for a couple of days (which I initially
blamed on tmpwatch in bug 27145). However, today some ext2fs errors appeared
on the console. When I rebooted the system I had to run fsck manually, which
reported a lot of errors. Have I been unlucky to hit a bad block, or are
there underlying problems with the filesystem software? (I don't remember
any problems before I upgraded to Fisher.)

Reproducible: Didn't try

Here are some extracts from /var/adm/messages
Feb 14 09:23:33 itspc116 kernel: attempt to access beyond end of device
Feb 14 09:23:33 itspc116 kernel: 03:03: rw=0, want=8388820, limit=901152
Feb 14 09:23:33 itspc116 kernel: EXT2-fs error (device ide0(3,3)): ext2_readdir: directory #54593 contains a hole at offset 0
Feb 14 09:23:33 itspc116 kernel: attempt to access beyond end of device
Feb 14 09:23:33 itspc116 kernel: 03:03: rw=0, want=10485788, limit=901152
Feb 14 09:23:33 itspc116 kernel: EXT2-fs error (device ide0(3,3)): ext2_readdir: directory #54593 contains a hole at offset 4096
Feb 14 09:23:33 itspc116 kernel: attempt to access beyond end of device
Feb 14 09:23:33 itspc116 kernel: 03:03: rw=0, want=6291560, limit=901152
Feb 14 09:23:33 itspc116 kernel: EXT2-fs error (device ide0(3,3)): ext2_readdir: directory #54593 contains a hole at offset 8192
Feb 14 09:23:33 itspc116 kernel: attempt to access beyond end of device
Feb 14 09:23:33 itspc116 kernel: 03:03: rw=2, want=538050772, limit=901152
Feb 14 09:23:33 itspc116 kernel: EXT2-fs error (device ide0(3,3)): ext2_readdir: bad entry in directory #54593: rec_len %% 4 != 0 - offset=0, inode=33188, rec_len=831, name_len=0
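
As a rough aid for triaging logs like the extract above, the following Python sketch pulls out every "want=..., limit=..." line and reports the requests that fall past the end of the device. The regular expression and the function name out_of_range_requests are assumptions made for illustration, based only on the message format quoted in this report; they are not part of any kernel or e2fsprogs tooling.

```python
import re

# Assumed format, taken from the log lines quoted in this report, e.g.
# "03:03: rw=0, want=8388820, limit=901152"
PATTERN = re.compile(r"([0-9a-f]+:[0-9a-f]+): rw=(\d+), want=(\d+), limit=(\d+)")


def out_of_range_requests(lines):
    """Return (device, want, limit) for each request beyond the device limit."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        want = int(match.group(3))
        limit = int(match.group(4))
        if want > limit:
            hits.append((match.group(1), want, limit))
    return hits


sample = [
    "Feb 14 09:23:33 itspc116 kernel: attempt to access beyond end of device",
    "Feb 14 09:23:33 itspc116 kernel: 03:03: rw=0, want=8388820, limit=901152",
    "Feb 14 09:23:33 itspc116 kernel: 03:03: rw=2, want=538050772, limit=901152",
]

for device, want, limit in out_of_range_requests(sample):
    print(f"{device}: want={want} exceeds limit={limit} by {want - limit}")
```

The requested block numbers here are many times larger than the partition, which is more consistent with garbage block pointers than with a request that merely ran slightly off the end of the device.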
Comment 1 Glen Foster 2001-02-15 20:12:01 EST
We (Red Hat) should really try to resolve this before next release.
Comment 2 Michael Young 2001-02-16 10:56:49 EST
I have found further disk corruption, having previously successfully run fsck -f
(single user) without errors. fsck -nf now tells me things like
Inode 55105 has illegal block(s).
Illegal block #0 (4041469680) in inode 55105.  IGNORED.
and
Error while iterating over blocks in inode 55105: Illegal triply indirect block found

Also, I have spotted errors such as the following occurring in the
/var/log/messages file, which didn't occur when I was running RH6.2.

Feb 16 09:21:05 itspc116 kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Feb 16 09:21:06 itspc116 kernel: hda: drive not ready for command


Comment 3 Florian La Roche 2001-02-21 09:27:44 EST
What IDE controller, mainboard and disks are you using? What is the exact
version of the kernel that is running on this system?

This seems to be a kernel problem with resulting disk corruption. I'll reassign
this to the kernel rpm, but will watch for further info about it.
Comment 4 Michael Young 2001-02-21 09:55:45 EST
I am not an expert on hardware, so I hope this makes sense:
Motherboard: ATX (pentium 166) "RM Advanced/ML Pentium Systemboard"
IDE controller: PIIX3 "82371SB PCI ISA/IDE Xcelerator"
Hard Disk: ST32132A
Kernel: 2.4.0-0.99.11
Comment 5 Florian La Roche 2001-02-21 11:29:45 EST
Can you please try newer kernels from
ftp://ftp.redhat.com/pub/rawhide/i386/RedHat/RPMS/
or from ftp://ftp.redhat.com/pub/redhat/beta/wolverine/i386/RedHat/RPMS/ to
check whether newer kernels already have this fixed?
Comment 6 Michael Young 2001-02-22 07:24:01 EST
I have upgraded the kernel to that in wolverine (2.4.1-0.1.9). The "drive not
ready" messages are still there (if they are in fact related to the disk
corruption), e.g.
Feb 22 11:36:25 itspc116 kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Feb 22 11:36:25 itspc116 kernel: hda: drive not ready for command
but it may be several days before the disk corruption reappears, if the upgrade
hasn't fixed it.
Comment 7 Michael Young 2001-02-23 06:02:43 EST
I have some more evidence that suggests the problem is still there. I upgraded
the XFree packages to wolverine, and afterwards fsck reports some block bitmap
differences, even when the file system is mounted read-only, e.g.
Pass 5: Checking group summary information
Block bitmap differences:  -186349 -186350 -186351 -186352 -186353 -186354
-186355 -187286 -187287 -187288 -187289 -187290 -187291 -187292 -195118 -195119
-195120 -195121 -195122 -195123 -195124 -195203 -195204 -195205 -195206 -195207
-195208 -195209
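
A long flat list of block numbers is easier to read once consecutive blocks are collapsed into ranges. The Python sketch below does that for the "Block bitmap differences" list quoted above. The helper name group_runs is hypothetical, and the parsing assumes only the format shown in this comment (space-separated block numbers, each prefixed with "-").

```python
def group_runs(blocks):
    """Collapse block numbers into (first, last) runs of consecutive blocks."""
    runs = []
    for block in sorted(blocks):
        if runs and block == runs[-1][1] + 1:
            runs[-1][1] = block  # extend the current run
        else:
            runs.append([block, block])  # start a new run
    return [(first, last) for first, last in runs]


# Block list copied from the fsck output quoted in this comment.
fsck_list = """
-186349 -186350 -186351 -186352 -186353 -186354
-186355 -187286 -187287 -187288 -187289 -187290 -187291 -187292 -195118 -195119
-195120 -195121 -195122 -195123 -195124 -195203 -195204 -195205 -195206 -195207
-195208 -195209
"""

blocks = [int(token.lstrip("-")) for token in fsck_list.split()]

for first, last in group_runs(blocks):
    print(f"{first}-{last} ({last - first + 1} blocks)")
```

For this list the grouping gives four separate runs of seven blocks each, 28 blocks in total.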
Comment 8 Michael Young 2001-02-28 10:40:57 EST
I had some more file corruption yesterday. I logged the fsck session used to
clean it up (from a second partition), in case this information would be useful.
Comment 9 Michael K. Johnson 2001-02-28 22:55:33 EST
Can you try 2.4.1-0.1.14 from rawhide?

If that doesn't fix it, the next rawhide we put out will have a
"nodma" option that will make it easier to debug this.
Comment 10 Michael Young 2001-03-02 12:44:27 EST
I am on 2.4.1-0.1.14 now. I haven't seen any file corruption yet (though the
system has hung, requiring a reset), but entries in /var/log/messages look
suspicious, for example

Mar  1 14:01:19 itspc116 kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Mar  1 14:01:19 itspc116 kernel: hda: drive not ready for command
Mar  1 14:01:19 itspc116 kernel: attempt to access beyond end of device
Mar  1 14:01:19 itspc116 kernel: 03:03: rw=0, want=790435384, limit=901152
Comment 11 Michael Young 2001-03-08 07:27:42 EST
I have had some more file/directory corruption with the 2.4.2-0.1.19 kernel. I
have a log of the fsck session afterwards (logged to a separate partition) and
messages in /var/log/messages if any of this is useful.
Comment 12 Michael Young 2001-04-06 08:14:46 EDT
I have had more corruption with 2.4.2-0.1.28 (while I was upgrading to
2.4.2-0.1.49). Again I have more details if you want them.
Comment 13 Arjan van de Ven 2001-04-07 16:19:37 EDT
{ DriveReady SeekComplete DataRequest } is usually an indication that
your cables are outside the allowed limits. This usually shows up only
when using the higher DMA modes, which our kernel has been using for a while now.
If you don't want to change cables, you can always boot with "ide=nodma" on the
kernel command line (e.g. at the lilo prompt).

Please test this and reopen the bug if this doesn't help.
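
For readers unfamiliar with the "status=0x58" notation, the names in braces are simply the bits set in the drive's status byte. The Python sketch below decodes such a byte. Only the three names seen in this report (DriveReady, SeekComplete, DataRequest) come from the logs; the remaining bit names follow the conventional ATA status register layout and are included as an assumption, and decode_status is a hypothetical helper, not a kernel function.

```python
# (mask, name) pairs for the ATA status register, highest bit first.
# Names other than DriveReady, SeekComplete and DataRequest are assumed
# from the conventional ATA bit layout, not taken from this report.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an 8-bit IDE status value."""
    return [name for mask, name in STATUS_BITS if status & mask]


print(decode_status(0x58))  # ['DriveReady', 'SeekComplete', 'DataRequest']
```

Note that 0x58 does not have the Error bit set: the drive reports itself ready with data still pending, which is why the driver follows it with "drive not ready for command".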
Comment 14 Christopher Johnson 2001-05-17 09:25:53 EDT
I encountered this problem, and the ide=nodma boot option avoids it, but I'm
doubtful of a cable problem since this system is a laptop.

Error indications in syslog were:
May 15 10:14:07 cjohnsonPC kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
May 15 10:14:07 cjohnsonPC kernel: hda: drive not ready for command

The HW/SW information in syslog was:
May 15 10:11:52 cjohnsonPC kernel: Uniform Multi-Platform E-IDE driver Revision: 6.31
May 15 10:11:52 cjohnsonPC kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
May 15 10:11:52 cjohnsonPC kernel: PIIX4: IDE controller on PCI bus 00 dev 39
May 15 10:11:52 cjohnsonPC kernel:     ide0: BM-DMA at 0x38a0-0x38a7, BIOS settings: hda:DMA, hdb:DMA
May 15 10:11:52 cjohnsonPC kernel:     ide1: BM-DMA at 0x38a8-0x38af, BIOS settings: hdc:pio, hdd:pio
May 15 10:11:52 cjohnsonPC kernel: hda: IBM-DARA-212000, ATA DISK drive
May 15 10:11:52 cjohnsonPC kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
May 15 10:11:52 cjohnsonPC kernel: hda: 23579136 sectors (12073 MB) w/418KiB Cache, CHS=1559/240/63, UDMA(33)
May 15 10:11:52 cjohnsonPC kernel:  hda: hda1 hda2 hda3 < hda5 hda6 hda7 >

I am running kernel-2.4.2-2 i686 straight out of RH 7.1.
The laptop is a Compaq Armada M700.

If someone at Red Hat or Compaq would pursue this issue, I would gladly provide
any needed info.
