Bug 242254
Summary: | New firewire stack only recognizing half of a chain of drives | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | George Shearer <doc> | ||||
Component: | kernel | Assignee: | Kristian Høgsberg <krh> | ||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Brian Brock <bbrock> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 7 | CC: | cebbert, chris.brown, davej, jarod, stefan-r-rhbz | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2008-04-25 03:52:24 UTC | Type: | --- | ||||
Description
George Shearer
2007-06-02 18:56:31 UTC
This is a known bug (or missing feature) of the new fw-sbp2 driver. On devices with multiple logical units (in the same unit directory of the ROM), only one unit is used. Many if not all FireWire-IDE bridges which support two IDE devices on the same IDE channel are of this type. One reason this is not working with fw-sbp2 (and didn't work with the old stack before Linux 2.6.12 either) is that the SBP-2 transport specification has additional provisions on top of IEEE 1212's generic scheme to represent multi-unit devices.

PS, regarding the impact of this bug: The popular dual-disk bridges with RAID-1/-0/JBOD implemented in firmware are of course not affected. They only show one logical unit.

Sadly, I must report that the status of this issue has not changed with the recent release of kernel 2.6.21-1.3194.fc7.

Still a problem with 2.6.21-1.3228.fc7 :(

I'm sure Kristian will add a notification here as soon as an update package with the necessary fix is available. If you need an interim solution for the short term, you need to install a kernel which has the older drivers ieee1394, ohci1394, and sbp2 enabled.

PS: I am currently doing a little bit of work with the fw-sbp2 driver (mainly on weekends, and not in affiliation with Fedora or Red Hat) and will post a patch here when I have one.

kernel-2.6.22.1-27.fc7 == :-(

I'm in the middle of implementing it, but I'm currently slowed down by other work. I will post here when I'm through with it unless somebody else gets it done faster.

Here is my first take:
http://thread.gmane.org/gmane.linux.kernel.firewire.devel/10453

The patches are also temporarily available at
http://me.in-berlin.de/~s5r6/linux1394/pending/.

(In reply to comment #9)
> Here is my first take:
> http://thread.gmane.org/gmane.linux.kernel.firewire.devel/10453
>
> The patches are also temporarily available at
> http://me.in-berlin.de/~s5r6/linux1394/pending/.

Thank you very much. I am eager to try the patches; unfortunately, I am traveling at the moment. I will be home on Tuesday Jul 31st and will try them then. Thanks again!

I fixed a few errors in my patches. You can get them for 2.6.23-rc1 or later from git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6.git or for 2.6.23-rc1, 2.6.22 and a few older kernels from http://me.in-berlin.de/~s5r6/linux1394/updates/. I don't know which patchset would be best to apply on top of the Fedora kernel sources.

Created attachment 160351
bootlog of 2.6.22.1-33.local.fc7
This is the boot log of the latest Fedora-released i686 kernel, with Stefan's
2.6.22 patch applied. Looks like it recognizes all drives. However, I see lots
of 'attempt to access beyond end of device' errors, which doesn't sound good to
me.
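
For reference, an 'attempt to access beyond end of device' error appears when a partition ends past the capacity the disk reports. The sketch below redoes that arithmetic for the sector count the attached boot log shows for sdh (66055248 sectors of 512 bytes). The partition layout in it is a hypothetical one for a 250 GB disk and is not taken from the log.

```python
SECTOR_SIZE = 512  # bytes per hardware sector, as reported in the boot log


def capacity_bytes(sectors, sector_size=SECTOR_SIZE):
    """Return the capacity in bytes for a given sector count."""
    return sectors * sector_size


def partition_exceeds_device(part_start, part_sectors, device_sectors):
    """True if the partition's last sector lies beyond the device's end."""
    return part_start + part_sectors > device_sectors


# Sector count reported for sdh in the boot log.
reported_sectors = 66055248
reported_bytes = capacity_bytes(reported_sectors)
reported_mb = reported_bytes // 10**6   # decimal megabytes, as the kernel prints
reported_gib = reported_bytes / 2**30   # binary gibibytes

# Hypothetical layout: one partition spanning a full 250 GB disk,
# starting at sector 63. These numbers are assumptions for illustration.
hypothetical_part_start = 63
hypothetical_part_sectors = 250 * 10**9 // SECTOR_SIZE - hypothetical_part_start

print(f"reported capacity: {reported_mb} MB ({reported_gib:.1f} GiB)")
print("partition exceeds device:",
      partition_exceeds_device(hypothetical_part_start,
                               hypothetical_part_sectors,
                               reported_sectors))
```

The reported size comes out just under 32 GiB, far below what a 250 GB disk should report, so any partition created at full capacity would extend past the end of the device as the kernel sees it.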
You have one Seagate ST350063, correctly recognized as a 500 GB disk, a few Maxtor 6 Y250P0 and Maxtor 6 B250R0, correctly recognized as 250 GB disks, and two Samsung SP2514N, which are incorrectly recognized as 32 GiB disks. Did you change the jumpers on the Samsung disks some time after you partitioned them? Many HDDs can be jumpered to pretend a 32 GiB limit for old BIOSes, and maybe you accidentally enabled that limitation.

PS: Note these log lines:
    scsi 8:0:1:1: Direct-Access-RBC SAMSUNG SP2514N VF10 PQ: 0 ANSI: 4
    sd 8:0:1:1: [sdh] 66055248 512-byte hardware sectors (33820 MB)
    ...
    sdh: sdh1
    sdh: p1 exceeds device capacity

Ugh! I should have looked closer. Your hunch was correct. The instructions on the Samsung drives are a bit misleading. So I downloaded the PDF from the manufacturer's website, which clearly indicates the correct settings. The problem has been fixed. Humorously enough, they've always been set this way and I've never had a problem out of them. Looks like the new driver is much better at reporting such issues.

As a side note, I unmounted all of my firewire drives and then extracted the two drives from the chassis to change the jumper. Unfortunately, the kernel panicked in this process, which never happened with the old driver during hot swapping.

In theory, the capacity should have been detected and checked the same way with the old drivers. It happens at the SCSI level, transparent to the FireWire drivers.

Does the hotswap mechanism detach the disks from IDE, or did the detachment happen on the FireWire side of the FireWire-IDE bridge board?

PS, re capacity: Maybe something else in the FS, block IO, or SCSI code or configuration changed along with the FireWire drivers. The differences in the FireWire drivers, as far as I am aware of them, seem to me very unlikely to be related.

PS, re panic: From the pictures and manual at http://www.norcotek.com/, it looks like the physical disconnection happens on the IDE side. A kernel panic message would be good to have to debug this. Alas, judging by the backplane in the enclosure, there seems to be no source other than Norco for a test sample of the bridge board(s).

afaik, this norco box uses oxford911 bridge chips, though it uses both the primary & slave portions which I've found to be unique. At any rate, it seems that this particular problem has been solved. Thank you for taking this on. All of my drives now work again, and it's nice to see all of them under the same scsi process. When I get time I'll attempt to reproduce the panic and capture the kernel output with serial console, and open a new ticket.

I've got an old 911 and newer 922 and 912 based enclosures but they don't have hotpluggable IDE headers. It would probably not be a good idea to attempt IDE hotplugging with them... :-)

Regarding the dual disk recognition, there are some refinements that Kristian suggested to me which I will implement sometime soon, and then the Fedora kernel maintainers have to merge the patch(es) in one of their next kernel package updates. I hope they will inform you here when they have released that update.

I added the patch and started a build:

http://koji.fedoraproject.org/koji/taskinfo?taskID

It will be available as kernel-2.6.23-0.140.rc3.git10.fc8 when it's done. You can download the build from that page or wait for tomorrow's rawhide. Please give it a try and let us know if it works for you.

Thanks (and to you too, Stefan)
Kristian

These fixes have not made it into a released F7 kernel yet. What's the process to make that happen?

That patch does not apply to kernel 2.6.22.

(In reply to comment #22)
> That patch does not apply to kernel 2.6.22.

I'm running a kernel I built myself using Stefan's patches, and it's 2.6.22. See comment #12. No reason why this can't be included in an F7 official kernel release. This is a pretty major bug for those of us who rely on large Firewire arrays.
The patch will make it into mainline (kernel.org's kernel) in 2.6.24-rc1. Even though it is more or less a bug fix, I decided against pushing the patch to Linus earlier because the patch has a huge line count and modifies core data structures of the firewire-sbp2 driver. Surely, distributors who switched to the new stack may consider incorporating the patch into their 2.6.{22,23} based kernels. I am quite confident that the patch is correct and safe. (Famous last words.)

If you guys think about taking it into an FC7 kernel, you could either wait a few days until Linus has pulled all post-2.6.23 driver updates, then look at firewire-sbp2's history in Linus' tree to grab the relevant patches on which the multi-LU patch depends. Or you could have a look at my personal site (see comment #11) to get a picture of the patch queue. (kernel.org's linux1394-2.6.git is somewhat messy at the moment and will change soon after Linus releases 2.6.23, therefore this git tree is not so well suited to pick backport candidates.)

I recently updated to the latest F7 kernel 2.6.23.1-10.fc7.i686. All of my FW drives are recognized properly. I can read reliably from any firewire drive. I can write reliably to any firewire drive as well. However, if I attempt to do both simultaneously, either to the same drive or to different drives, a kernel panic will happen. :(

Can you post the panic?

Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

Are you still having this issue, and if so, could you attach the kernel panic as text/plain to this bug? If the problem no longer exists then please close this bug, or I'll do so in a few days if no additional information is lodged.

Can the kernel panic still be reproduced with the latest kernel available in the updates repo?

Note that maintenance for Fedora 7 will end 30 days after the GA of Fedora 9.
The information we've requested above is required in order to review this problem report further and diagnose/fix the issue if it is still present. Since there have been no updates to the report for thirty (30) days or more after we requested additional information, we're assuming the problem is either no longer present in the current Fedora release, or that there is no longer any interest in tracking the problem.

Setting status to "CLOSED INSUFFICIENT_DATA". If you still experience this problem after updating to our latest Fedora release and can provide the information previously requested, please feel free to reopen the bug report.

Thank you in advance.

Note that maintenance for Fedora 7 will end 30 days after the GA of Fedora 9.