From Bugzilla Helper:
User-Agent: Mozilla/4.78 [en] (X11; U; Linux 2.4.9-31enterprise i686)

Description of problem:
This is a difficult bug. The machine is a DELL Latitude C800 with the following hardware connected to the ieee1394 port:
- a LaCie 30 GB hard drive
- a Fujitsu DynaMO 1300FE MagnetoOptical drive (chained to the LaCie disk)

I successfully ran the LaCie disk for months under RH7.2. I am able to run the MO drive alone, apparently with no problem (to be verified; I have not stressed this configuration much). As soon as I put both in operation, I got hard machine crashes under RH7.2 (kernel-2.4.9-31) on roughly 1 boot out of 2. The crashes occurred while inserting ieee1394->ohci1394->sbp2 and made the machine reboot.

I moved to skipjack2 (actually only the kernel, glibc, initscripts, hotplug, mount and friends; I cannot afford to jeopardize the machine too much). The situation improved: I am able to use the MO drive and the hard disk together, even for hours and hundreds of MBytes of transfers. However, almost every day at some point the kernel stops recognizing the MO disks correctly (they use 2048-byte blocks) and starts to complain that all operations on block 0 (the partition table) result in short reads/writes. There is apparently no way to recover the disk. However, mounting it using superblock 16384 works perfectly, and e2fsck'ing it using block 16384 is OK, except for a systematic error message that block 0 cannot be restored.

I was thinking about disk hardware failures, maybe triggered by the MO drive; however:
- the same disk inserted into an identical drive (but SCSI, not Firewire) is OK except for the partition table, which is corrupted; e2fsck'ing from superblock 16384 restores it perfectly, provided I dd if=/dev/null the 1st block of the disk beforehand (this operation does not work on the Firewire drive, resulting in an I/O error, short read/write).
At this point the disk works flawlessly again in the original Firewire drive, until some voodoo triggers the partition table corruption again. This occurs with both 640 MByte and 1300 MByte disks, with both ext2 and ext3 partitions, and with both real partitions and the "superfloppy" format. Needless to say, the drive works flawlessly under Windows ME in exactly the same configuration.

I have the feeling that the coexistence of the two drives sometimes triggers a situation where the kernel "forgets" that the drive has a 2048-byte block size. The problem typically occurs:
a) during scsi bus rescans (but not always)
b) during ieee1394 resets (i.e. removing and inserting sbp2); sometimes I also get a kernel oops in this condition
c) when the hard disk is mounted
d) after several hours of inactivity
e) when the MO disk is unmounted
It never occurs during normal operations (reading/writing files).

None of the above situations is 100% reproducible; however, in 10 days, using that machine only in the evening, I got at least 10 hard hangs and more than 20 spoiled partition tables. I am happy to help carry out further tests if you have any good suggestions.

Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. get both a Firewire hard disk and a Firewire MO drive
2. do some gymnastics with them
3. the MO drive partition table gets corrupted

Actual Results: The MO partition table gets corrupted randomly

Expected Results: no problem

Additional info:
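For anyone who wants to check for the short read on block 0 from user space, a minimal Python sketch could look like the following. It assumes the 2048-byte block size mentioned above; the device path is passed on the command line, and the function names are purely illustrative (they do not come from any kernel tool).

```python
import os
import sys

# Assumption: block size of the MO media, as stated in the report.
BLOCK_SIZE = 2048


def read_block0(path, block_size=BLOCK_SIZE):
    """Read the first block of `path` and return the number of bytes obtained.

    Any OSError (for example EIO from the device) is reported as -1 so the
    caller can tell an I/O error apart from a short read.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return -1
    try:
        return len(os.read(fd, block_size))
    except OSError:
        return -1
    finally:
        os.close(fd)


def block0_status(path, block_size=BLOCK_SIZE):
    """Classify the result of reading block 0: 'ok', 'short-read' or 'io-error'."""
    n = read_block0(path, block_size)
    if n < 0:
        return "io-error"
    if n < block_size:
        return "short-read"
    return "ok"


if __name__ == "__main__" and len(sys.argv) > 1:
    # The device node is whatever the MO drive shows up as on the machine.
    print(sys.argv[1], block0_status(sys.argv[1]))
```

Running it against the MO device node before and after one of the events listed above (bus rescan, sbp2 reload, unmount) might help narrow down which one leaves block 0 unreadable. It only reads from the device, so it should not make the corruption worse.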
... I got no feedback. I just want to confirm that the problem is still there in RH7.3 with kernel-2.4.18-4, and it is even worse (it systematically wipes out the .journal inode on my ext3 partitions).
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/