Bug 240708

Summary: FC6 is eating my hard drives
Product: [Fedora] Fedora Reporter: Joseph A. Farmer <jfarmer99>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 6   
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: 7 Doc Type: Bug Fix
Last Closed: 2007-11-10 19:05:12 UTC Type: ---
Attachments:
Screenshot of drive doom. (flags: none)

Description Joseph A. Farmer 2007-05-21 00:34:14 UTC
Description of problem:  Fedora Core 6 eats my disks.


Version-Release number of selected component (if applicable): 6


How reproducible:  It's managed 2 so far and 2 more are looking grim.


Steps to Reproduce:
1. Use Fedora 6.  Lose disk.
2.  Rinse.
3.  Repeat.
  
Actual results:
Disks physically dead.

Expected results:
Disks shouldn't resemble the Parrot of legend.

Additional info:
I'm going to try very hard not to get frustrated and rant.  I have 3 computers
that I use myself.  The newest is a Dell SC400 and the other 2 are a bit older.
Installing FC6 failed on all 3 when it was released.  It had something to do
with the DVD drives: the install would start and then, when it was time for FC6
CD2, it couldn't find the DVD drive.  I ran a series of CDRW drives through the
machines to find the magical one and found that a CD drive without write
capability seemed to get around that.  There was a catch though.  As part of
that process a 60GB and an 80GB drive gave up the ghost.  Physically gone.  I
don't think it really had a lot to do with the DVD/CD swaps though, as the
drives are PATA and on the other channel.  Neither drive will take a format.
Such is life, the data is gone forever and some was kind of valuable to me but
that's that.  Both the 60 and 80 were boot drives but the slaves on the primary
IDE interface were fine.  The DVD/CDs were primary on the second channel.

So after getting FC6 installed, and losing all my data, things have been kind of
ok.  Today I finally rebooted the SC400.  Another drive gone.  Perhaps 2.  That
would make 3/4 that FC6 has munched.  RH Linux 7.X - 9 didn't eat any.  FC1-5
didn't either.  FC6 seems a bit drive hungry.  Drives themselves are pretty
inexpensive so that's not a huge deal but darned if I like losing the data on
them.  At this point I'm starting to think the only way forward with Fedora is
on RAID hardware.

Now the question is: the SC400 has 3 hard drives and 1 DVD.  The attached
screenshot shows ext3 thinks the drives are directories, but I'm not clear what
that even means.  Are these drives now gone with the wind also?  Are the 250 and
300 both pining for the fjords?

The 250 is master on the second IDE channel.  The 300 is slave on the first so
they're on separate channels.  The odds that both failed at the same time are
slim indeed.

FC6/EXT3 seems to be eating my disks.

How do I bring them back if they're not physically destroyed?  The boot drive,
the oldest of them by the way, is fine.  PATA all.  I don't think the 300 is
even a year old and the 250 isn't much older.  They were both fine this morning
until I decided to reboot for cleaning reasons (area around computer, computer
was untouched).

The computer can sit powered off until we see if somebody can look at this.  I
fear for those 2 drives.

Comment 1 Joseph A. Farmer 2007-05-21 00:34:14 UTC
Created attachment 155063 [details]
Screenshot of drive doom.

Comment 2 Joseph A. Farmer 2007-05-26 03:35:10 UTC
Ok, I dug into this deeper and it just got stranger as I went.  The first
problem was that a stale mtab was confusing the system.  That wasn't the real
problem though.

When the machine booted it failed on the boot screen as that screenshot showed.
Running fsck on /dev/hdb1 and /dev/hdc1 returned "bad superblock, not ext3" on
both.  I ran smartctl and it claimed both drives were fine.  I then ran fdisk
and both partitions looked fine.  Next up was fsck again and it persisted with
the bad superblock thing.  Running e2fsck directly with -p returned "drives
clean, x number of files", so e2fsck was reading them fine while fsck on its own
wasn't.
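
For anyone trying to reproduce the fsck vs. e2fsck disagreement, the checks
above can be gathered into one read-only pass.  This is a minimal sketch, not a
fix: check_ext_fs is a hypothetical helper name, it assumes blkid, fsck,
dumpe2fs and e2fsck are installed, and it should be run as root on an unmounted
partition.

```shell
#!/bin/sh
# Read-only checks for one partition; none of these commands repair anything.
# check_ext_fs is a hypothetical helper name, not an existing tool.
check_ext_fs() {
    dev="$1"
    if [ ! -b "$dev" ]; then
        echo "$dev: not a block device" >&2
        return 2
    fi
    # Which filesystem type do the on-disk signatures indicate?
    blkid "$dev"
    # Which checker would the fsck wrapper run? -N prints it without running it.
    fsck -N "$dev"
    # Print only the superblock fields.
    dumpe2fs -h "$dev"
    # Check read-only, answering "no" to every prompt.
    e2fsck -n "$dev"
}
```

Usage would be something like `check_ext_fs /dev/hdb1`.  If `fsck -N` names a
different checker than expected, or blkid reports a different type than the
superblock dump suggests, that narrows down where the two tools disagree.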

I tried to edit fstab to delete the references to them but the root partition
was mounted read-only and umount didn't help there.  Booting from FC6 disc one
hung at the "loading ata_piix" screen (as happened originally on all 3
machines).  I tracked down an FC5 test 3 CD, booted that into rescue mode and
deleted the references.  That is where it sits now.  With a twist.

If I put a reference to the drives into fstab it fails to boot with the "no
superblock, not ext3" screen.  If I run e2fsck it reports the drives are fine. 
If I wait until after boot and just mount manually:
mount /dev/hdb1 /mnt/300
mount /dev/hdc1 /mnt/250
both drives mount fine.  All files are there.  It just won't boot with mount
instructions in fstab.
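
Since the failure only shows up when the drives are listed in fstab, it may
help to see which entries the boot-time fsck pass would actually check.  The
sixth fstab field (fs_passno) controls this: an entry with 0 there, or with the
field missing, is not checked at boot.  A minimal sketch, assuming a standard
six-field fstab; boot_fsck_entries is a hypothetical helper name and it only
reads the file:

```shell
#!/bin/sh
# List fstab entries that boot-time fsck would check (sixth field non-zero).
# boot_fsck_entries is a hypothetical helper; the file is never modified.
boot_fsck_entries() {
    fstab="${1:-/etc/fstab}"
    awk '
        /^[ \t]*#/   { next }   # skip comment lines
        NF < 3       { next }   # skip blank or malformed lines
        ($6 + 0) > 0 { print $1, $2, $3, "pass=" $6 }
    ' "$fstab"
}
```

Running `boot_fsck_entries` with no argument reads /etc/fstab.  Whether the
entries for /dev/hdb1 and /dev/hdc1 appear in its output shows if the boot-time
check is even attempted on them; it does not explain why that check fails.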

Odd and frustrating.

Why fsck reports that the disks aren't ext3, while e2fsck -p has no issues, is a
mystery and a pretty serious bug in fsck.  I was about ready to reformat the
drives before I discovered the e2fsck -p solution.  e2fsck also has a -v
(verbose) option that is anything but, and that's a bug too: the drive takes a
long time to check and no text is reported until the check completes, so what
part of that is "verbose" is not clear.

So consider it a bug against fsck.  Also one against stale mtab files.  

Current mtab:
/dev/hda1 / ext3 rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
/dev/hdb1 /mnt/300 ext3 rw 0 0
/dev/hdc1 /mnt/250 ext3 rw 0 0

Note both drives are mounted ext3.  I mounted them manually as stated.
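
On the stale mtab side, one way to spot leftover entries is to compare the mtab
file against the kernel's own mount table in /proc/mounts.  This is a rough
sketch under assumptions: stale_mtab_entries is a hypothetical helper name, and
the root filesystem may show up under a different device name in /proc/mounts
(for example /dev/root), so a mismatch on / is not necessarily stale.

```shell
#!/bin/sh
# Print "device mountpoint" pairs that are listed in the mtab file but absent
# from the kernel's mount table. stale_mtab_entries is a hypothetical helper.
stale_mtab_entries() {
    mtab="${1:-/etc/mtab}"
    kernel="${2:-/proc/mounts}"
    awk '
        # First file (kernel view): remember every device/mountpoint pair.
        NR == FNR { live[$1 " " $2] = 1; next }
        # Second file (mtab): report pairs the kernel does not know about.
        { key = $1 " " $2; if (!(key in live)) print $1, $2 }
    ' "$kernel" "$mtab"
}
```

With no arguments it compares /etc/mtab against /proc/mounts.  On systems where
/etc/mtab is a symlink into /proc the two are the same data and nothing is
printed.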

Comment 3 Chuck Ebbert 2007-05-30 20:21:48 UTC
Please post contents of /etc/fstab

Comment 4 Joseph A. Farmer 2007-11-10 19:05:12 UTC
Booting a new Fedora 7 CD and selecting "install" instead of upgrade cleared up
the issue.  Machine operates as expected now.

fsck is probably still messed up.