Bug 485571

Summary: Cannot read partition table correctly on 1394a-to-PATA hard disk enclosure
Product: [Fedora] Fedora
Reporter: William M. Quarles <walrus>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 10
CC: itamar, kernel-maint, quintela, stefan-r-rhbz
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-06-29 18:11:33 UTC

Description William M. Quarles 2009-02-14 17:24:14 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.4) Gecko/2008111217 Fedora/3.0.4-1.fc10 Firefox/3.0.4

I have a Futura Mobile Storage Solution 3.5" External IDE (PATA) HDD Enclosure with USB 2.0 and Firewire support. I'm having problems using it with my Firewire connection in Anaconda, but not the USB 2.0 connection. I'm assuming this really has something to do with libraw1394, but considering that I don't really know what I am doing with mounting filesystems under SELinux, and the error message that I received was from Anaconda, I thought I would try Anaconda first and see what you thought.

When I boot my computer with the hard drive attached via USB 2.0, Anaconda detects the hard drive and its partition table without any error messages. When using the 1394a connection, however, I get a very different result.

"The partition table on device sdc (WDC WD40 0EB-11CPF0 31862) was unreadable.

"To create new partitions, it must be initialized, causing the loss of ALL DATA on this drive.

"This operation will override any previous installation choices about which drives to ignore.

"Would you like to initialize this drive erasing ALL DATA?"

If I select "No," I can continue installation, but it will keep reminding me of this issue. At some point the installation always locked up hard, but I have yet to determine which hardware device causes this. Memtest86 does not detect any errors in my RAM. The DVD-R verifies with no errors. When I remove my CardBus USB 2.0/1394a adapter and all devices connected to the computer (a Dell Latitude C640 laptop) with the exception of my docking station, mouse, and external monitor, the installation does not lock up.

Reproducible: Always

Steps to Reproduce:
1. Connect described external HDD with the IEEE-1394a connection
2. Power on computer with installation DVD-R inside.
3. Run installation and see results.
4. Repeat steps 1-3 with USB 2.0 connection instead.
Actual Results:  
The hard drive was detected to have an unreadable partition table on the IEEE-1394a connection, but not on the USB 2.0 connection. Error message was:

"The partition table on device sdc (WDC WD40 0EB-11CPF0 31862) was unreadable.

"To create new partitions, it must be initialized, causing the loss of ALL DATA on this drive.

"This operation will override any previous installation choices about which drives to ignore.

"Would you like to initialize this drive erasing ALL DATA?"

Expected Results:  
It works in Windows XP, so I expected the same in Anaconda.

Comment 1 Chris Lumens 2009-02-16 15:25:17 UTC
Please try again with F11 Beta, as we are currently right in the middle of a giant partitioning rewrite and it's very hard to say what the status of your bug will be once we're done.  Thanks.

Comment 2 William M. Quarles 2009-02-16 16:11:17 UTC
Did you mean Alpha or Beta? Just checking. Please reset the needinfo flag after verifying, so we can both keep track of this.

Comment 3 Chris Lumens 2009-02-16 16:16:53 UTC
Definitely the beta, which is not out yet.  The alpha doesn't really have much new in the way of partitioning, so your bug likely still exists there.

If you are really brave, you could also try Rawhide though it's going to be in rough shape for the next week or two.

Comment 4 Stefan Richter 2009-02-19 22:27:17 UTC
Could you obtain the output of "dmesg" after the failure and attach it here?

It could be bad firmware of the FireWire part in the enclosure.  Since you apparently have Windows at your disposal, have a look at the list of firmware updates at http://ieee1394.wiki.kernel.org/index.php/Firmware_Downloads .  Start with the Prolific updater from http://www.prolific.com.tw/ because the infamous Prolific chip is used in many FireWire 400 + USB 2.0 enclosures for IDE devices.
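
For the "dmesg" request above, a minimal sketch of how a saved dump could be narrowed down to the storage-related lines before attaching it. The keyword patterns and the function name are assumptions for illustration only; driver names differ between kernel versions, and the full unfiltered log is still the safer thing to attach.

```python
import re

# Assumed patterns, not an authoritative list: FireWire/SBP-2 driver
# names and SCSI disk names such as "sdc".
PATTERNS = (r"firewire", r"ieee1394", r"sbp2", r"\bsd[a-z]+\b")


def filter_kernel_log(text, patterns=PATTERNS):
    """Return the lines of a saved dmesg dump that match any pattern."""
    combined = re.compile("|".join(patterns), re.IGNORECASE)
    return [line for line in text.splitlines() if combined.search(line)]
```

The input would be the text saved from "dmesg" right after the failure, for example read from a file.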

Comment 5 William M. Quarles 2009-05-28 00:46:03 UTC
Sorry I didn't get back to you sooner, but the Preview Release still shows the same basic behavior, the main difference being that the graphics look really screwed up on my system.

This device does use a Prolific chip, but I am unsure as to which one it is other than it is a PL-3507.  Apparently there are at least chip versions B, C, and D, possibly others. If someone could tell me how to get the chip version I'd be willing to try a firmware update, but the updaters are particular to certain chip versions.

Todd Denistion replied to my thread on the fedora-user list entitled "Firewire HDD causes hard lockups," which started on 9 Feb 2009 7:57 PM EST, and said the following:

> There should probably still be a BZ if this fixes it, and I would have expected 
> the same problem using the USB connection, i.e., mention in the bug that same 
> drive with USB works fine, either they should both break or both work.
> 
> partition the drive with a Linux partitioner.
> format the partition NTFS using XP Pro as before, but be VERY careful to make 
> the format program use the PARTITION as opposed to the WHOLE DRIVE, which IIRC  
> MS defaults to whole drive.

I found a workaround, but considering that it still entails losing all of the data on the drive, I don't think it is satisfactory. In addition, in my opinion at least, if the drive works in Windows XP systems, then compatibility, practicality, and usability for the end-user demand that it work in Fedora as well, even if it is not as clean and as compliant as one likes. The fix I found was to partition the drive as one primary partition and leave the last 8MB of the drive out of the partition, then format again as NTFS.

In any case, Todd is right, they should either both break or both work.
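
The arithmetic behind the workaround described above (one primary partition that stops 8MB short of the end of the drive) can be sketched as follows. This assumes 512-byte sectors, reads "8MB" as 8 MiB, and uses a start sector of 63 as a typical value for that era; none of these figures are confirmed in this report, and the function name is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector; an assumption, not confirmed in the report


def partition_end_sector(total_sectors, reserve_mib=8, start_sector=63):
    """Return the last sector (inclusive) of a single primary partition
    that leaves reserve_mib MiB unused at the end of the drive."""
    reserve_sectors = reserve_mib * 1024 * 1024 // SECTOR_SIZE
    end = total_sectors - reserve_sectors - 1
    if end <= start_sector:
        raise ValueError("disk too small for the requested reserve")
    return end
```

The total sector count would come from the partitioning tool's report for the drive; the result is the end sector to enter when creating the partition.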

Comment 6 Stefan Richter 2009-05-28 06:46:00 UTC
> I'd be willing to try a firmware update, but the updaters are particular
> to certain chip versions.

AFAIK the updater checks that and tells you what you got.  (It definitely does show which firmware version is presently installed, I think it also shows the chip revision.)

Furthermore, unlike repartitioning, firmware update does not cause data loss.

>> There should probably still be a BZ if this fixes it, and I would have
>> expected the same problem using the USB connection, i.e., mention in
>> the bug that same drive with USB works fine, either they should both
>> break or both work.

No, not at all.  The FireWire part and the USB part of PL3507 are entirely independent of each other, have different firmwares, and different bugs even at the highest level.  It is well-known that PL3507's USB part works quite well while its FireWire part works fair or badly or effectively not at all, depending on firmware version.  This is true on all OSs, not just Linux.

On Linux there is only the extra difficulty that it uses more comprehensive partition recognition requests than most Windows variants, and generally emits some requests (or rather: emits them in orders) which firmware authors never anticipated and tested.  (Vendors of devices in the mass market with low margins tend to ignore the standards to which their firmwares and product brochures claim conformance; instead they implement a fraction of them, test with one or two Windows variants, and ship it.)
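
To make "partition recognition" concrete, here is a minimal sketch of the simplest possible case: decoding a classic MBR partition table from the first 512-byte sector of a disk. It is an illustration only, not the kernel's or anaconda's code; actual recognition on Linux issues more reads and checks more formats than this. The function name is hypothetical.

```python
import struct


def parse_mbr(sector):
    """Parse a 512-byte MBR sector and return a list of
    (partition_type, start_lba, num_sectors) for the non-empty entries."""
    if len(sector) < 512:
        raise ValueError("need a full 512-byte sector")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        # Four 16-byte entries start at offset 446.
        raw = sector[446 + 16 * i : 446 + 16 * (i + 1)]
        # Layout: status, 3 CHS bytes, type, 3 CHS bytes, LBA start, sector count.
        _status, ptype, start_lba, num_sectors = struct.unpack("<B3xB3xII", raw)
        if ptype != 0:
            entries.append((ptype, start_lba, num_sectors))
    return entries
```

If a device returns wrong or no data for sector 0, a parser like this fails at the signature check, which is one way a partition table can end up reported as unreadable.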

> In addition, in my opinion at least, if the drive works in Windows XP systems,
> then compatibility, practicality, and usability for the end-user demand
> that it work in Fedora as well,

The Linux kernel developers do what they can to work around device bugs.  They add new workarounds for USB storage device bugs on an almost weekly basis.  The same could probably be done with your firmware if (a) it turns out that you cannot update firmware or already got current firmware and (b) you provide verbose SCSI command logs to linux-scsi.org and (c) somebody experienced in this area takes time to look at the logs and supplies you with kernel patches for testing.

PS:
One other thing you should investigate:  The device could also be upset by commands which userspace emits.  This could be helpers which are called by udev (can be suppressed by "killall udevd" before connecting the device --- not as a fix or workaround, but just for diagnosis) or maybe anaconda (I don't know how anaconda works).

PPS:
...unless you already repartitioned and can use the device now.

Comment 7 Andy Lindeberg 2009-06-08 18:09:03 UTC
It looks like this bug is a problem with the firewire driver, so I'm reassigning this to kernel.

Comment 8 William M. Quarles 2009-06-23 03:08:03 UTC
I'm not certain if it is the kernel or not. The problem was not as evident with the most recent version (kernel-2.6.27.24-170.2.68.fc10.i686) when running Fedora 10; however, Fedora 11 Preview Release still had the "bug" in the installer. I would have tested further with the Preview Release, but I couldn't get the installer to work properly on my laptop regardless of the state of the connection to the firewire drive.

However, in other news, the Prolific firmware update did fix this issue for me. Programming a workaround into the kernel might improve compatibility but limit performance, so my best recommendation would be to make an addition to the Fedora release notes and to the kernel documentation to (if possible) update firmware for external devices, including specifically 1394 devices, prior to installation of the operating system, or if post-OS-installation, prior to making full use of the device.

Comment 9 Stefan Richter 2009-06-25 06:30:59 UTC
There shouldn't be a general recommendation to update firmwares, for a number of reasons.

PL3507 is a special case because it was often shipped with severely broken FireWire firmwares.  This chipset is known as by far the worst of all FireWire chipsets; what's not as well-known is which drives contain it.  PL3507 is a 1394a + USB 2.0 to IDE (PATA) adapter, hence the good news is that you'll not find it anymore in newer FireWire HDD enclosures which contain SATA drives.

But even with PL3507, firmware upgrade is not a particularly good recommendation because, for example, users need MS Windows to run the firmware updater.  PL3507 based products should be avoided in the first place, or returned for a replacement with proper chipset while in warranty, but this too is simpler said than done because vendors almost never provide information which chipset they use.

Comment 10 William M. Quarles 2009-06-25 10:06:33 UTC
Uh, that didn't come out right. It was late at night my time.... I should have specifically said the PL3507. And users don't need MS Windows, they just need *access* to MS Windows. This could be supported in Wine eventually, too, or a simple open-source alternative to the Prolific updater could be written, with the ROMs available for download from the Prolific site.

Anyway, I don't think that it is a good move to just tell users to avoid a particular chipset, particularly for the reasons that you already brought up as "simpler said than done;" I think that is the lazy and weak way to go about solving the problem.

Comment 11 Chuck Ebbert 2009-06-29 18:11:33 UTC
I've asked about getting a general note added to the release notes telling people to update their firmware and BIOS.