Description of problem:
In late 2003 there was a big stink about an update to Mac OS X 10.3 that had a bad interaction w/firewire drives with the Oxford 911 chipset. It seems that sometime in the recent past, the kernel has run into this same issue. When a drive with the old/buggy firmware is inserted, syslog logs:

May 23 11:44:30 spx kernel: ieee1394: Error parsing configrom for node 0-00:1023
May 23 11:44:30 spx kernel: ieee1394: sbp2: Driver forced to serialize I/O (serialize_io=1)
May 23 11:44:30 spx kernel: ieee1394: sbp2: Try serialize_io=0 for better performance
May 23 11:44:30 spx kernel: scsi0 : SBP-2 IEEE-1394
May 23 11:44:31 spx kernel: ieee1394: sbp2: Logged into SBP-2 device
May 23 11:44:31 spx kernel: Vendor: IC25N080 Model: ATMR04-0 Rev:
May 23 11:44:31 spx kernel: Type: Direct-Access-RBC ANSI SCSI revision: 04
May 23 11:44:31 spx kernel: 0:0:0:0: Attached scsi generic sg0 type 14
May 23 11:44:31 spx kernel: SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
May 23 11:44:31 spx kernel: sda: Write Protect is off
May 23 11:44:31 spx kernel: SCSI device sda: drive cache: write back
May 23 11:44:31 spx kernel: sd 0:0:0:0: SCSI error: return code = 0x8000002
May 23 11:44:31 spx kernel: sda: Current: sense key: Aborted Command
May 23 11:44:31 spx kernel: Additional sense: Logical block address out of range
May 23 11:44:31 spx kernel: end_request: I/O error, dev sda, sector 156301480
May 23 11:44:31 spx kernel: Buffer I/O error on device sda, logical block 19537685

which repeats a dozen or so times.
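A quick arithmetic check on the numbers in the log above may help narrow down where the failing read lands. This is only a sketch using values printed in the log; the function and variable names are made up for illustration, and it assumes the buffer block number on the whole-disk device sda maps linearly onto sectors starting at sector 0.

```python
SECTOR_SIZE = 512  # bytes, from "512-byte hdwr sectors" in the log

# Values copied from the syslog excerpt above.
TOTAL_SECTORS = 156301488  # "SCSI device sda: 156301488 512-byte hdwr sectors"
FAILED_SECTOR = 156301480  # "end_request: I/O error, dev sda, sector 156301480"
FAILED_BLOCK = 19537685    # "Buffer I/O error on device sda, logical block 19537685"


def sectors_per_block(sector, block):
    """Infer how many sectors make up one buffer block.

    Assumes the block number maps linearly onto sectors from sector 0,
    so the sector number must be an exact multiple of the block number.
    """
    if block <= 0 or sector % block != 0:
        raise ValueError("sector is not an exact multiple of block")
    return sector // block


def sectors_from_end(total, sector):
    """Distance in sectors between the failing sector and the end of the disk."""
    return total - sector


def capacity_mb(total, sector_size=SECTOR_SIZE):
    """Capacity in decimal megabytes, as the kernel message reports it."""
    return total * sector_size // 10**6


if __name__ == "__main__":
    ratio = sectors_per_block(FAILED_SECTOR, FAILED_BLOCK)
    print("buffer block size:", ratio * SECTOR_SIZE, "bytes")
    print("sectors from end of disk:", sectors_from_end(TOTAL_SECTORS, FAILED_SECTOR))
    print("capacity:", capacity_mb(TOTAL_SECTORS), "MB")
```

Under those assumptions the failing sector is 8 sectors short of the reported capacity, and one buffer block is 8 sectors, so the read that fails would be the very last block of the device. That is consistent with the "Logical block address out of range" sense data, but it does not by itself say whether the firmware misreports the capacity or mishandles reads near the end.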
Using the disk will eventually result in:

May 4 19:26:44 dhcp202 kernel: ieee1394: sbp2: aborting sbp2 command
May 4 19:26:44 dhcp202 kernel: sd 0:0:0:0:
May 4 19:26:44 dhcp202 kernel: command: Read (10): 28 00 03 30 9f c1 00 00 08 00
May 4 19:26:54 dhcp202 kernel: ieee1394: sbp2: aborting sbp2 command
May 4 19:26:54 dhcp202 kernel: sd 0:0:0:0:
May 4 19:26:54 dhcp202 kernel: command: Test Unit Ready: 00 00 00 00 00 00
May 4 19:26:54 dhcp202 kernel: ieee1394: sbp2: reset requested
May 4 19:26:54 dhcp202 kernel: ieee1394: sbp2: Generating sbp2 fetch agent reset
May 4 19:27:04 dhcp202 kernel: ieee1394: sbp2: aborting sbp2 command
May 4 19:27:04 dhcp202 kernel: sd 0:0:0:0:
May 4 19:27:04 dhcp202 kernel: command: Test Unit Ready: 00 00 00 00 00 00
May 4 19:27:04 dhcp202 kernel: sd 0:0:0:0: scsi: Device offlined - not ready after error recovery
May 4 19:27:04 dhcp202 kernel: sd 0:0:0:0: SCSI error: return code = 0x50000
May 4 19:27:04 dhcp202 kernel: end_request: I/O error, dev sda, sector 53518273
May 4 19:27:04 dhcp202 kernel: printk: 50 messages suppressed.
May 4 19:27:04 dhcp202 kernel: Buffer I/O error on device sda8, logical block 900856
May 4 19:27:04 dhcp202 kernel: sd 0:0:0:0: rejecting I/O to offline device
May 4 19:27:04 dhcp202 kernel: Buffer I/O error on device sda8, logical block 900856

which goes on and on till device access stops. After the device firmware is upgraded, the device is accessible with no problems. Given that this can cause data loss, if the bogus firmware version can be detected and blacklisted, that would probably be a good thing...

Version-Release number of selected component (if applicable):
Unfortunately, the fc4 system I was using this drive on died and had to be re-installed. My earliest log tells me that the issue was present in the 2069_FC4 kernel, and persists in the current kernel. I do know that the same drive, plugged in to an up-to-date RHEL i386 system, does not exhibit this problem.

How reproducible:
Always.

Steps to Reproduce:
1. connect firewire drive w/bad firmware
2.
3.

Actual results:
error message, errors accessing drive

Expected results:
warning message about firmware, maybe require some option to force mount.

Additional info:
Page on updating firmware, from Mac site:
http://eshop.macsales.com/Reviews/Framework.cfm?page=/hardwareandnews/oxford/oxfordandpanther.html
As someone from Oxford Semiconductor kindly posted at linux1394-user, the way to properly detect chip and firmware of OxSemi based devices is to read at a certain offset from the configuration ROM. This can be done for example

- with gscanbus:
  Click on the device icon to see its "Physical ID". Use the menu "Transactions/Read Quadlet". In the dialogue, enter the ID as destination and 0xFFFFF0050000 as memory offset and hit OK. A result should appear in the third text box.

- with 1394commander:
  Enter the command
      : i
  to get some basic information about the bus. Guess the disk's physical ID from it or from syslog. Enter the command
      : r . # 0xFFFFF0050000 4
  with # replaced by the disk's physical ID (e.g. 0 if there are only two nodes and the local node has ID 1). A success message and 4 read bytes should appear.

We would need the value obtained this way from affected firmwares, and ideally also from unaffected firmwares to cross-check. Then it is possible to add some code to sbp2 to warn about these devices or perhaps even activate a workaround to avoid the "SCSI error... Logical block address out of range", in case there is such a workaround.

I have one enclosure with OXFW911 which does not show the signs you described. Its magic number is 0x88000731. The last byte, 31, is firmware revision information; all other bytes denote the chip type OXFW911.

That said, I would rather somebody wrote a Linux utility for firmware uploads than add these workarounds to the kernel driver. That would of course require additional information from Oxford Semiconductor (and from any other SBP-2 bridge manufacturer whose chips we wanted to support).

BTW, the problem with Oxford chips under OS X 10.3 was about OXUF922 (FireWire 800 bridge), not the OXFW911.

Furthermore, the "sbp2: aborting sbp2 command" during later disk access may be unrelated to the initial "SCSI error... Logical block address out of range" and may be a driver bug instead of a firmware bug.
There are conceptual problems in sbp2 which I hope to resolve eventually. (Don't hold your breath, I am already half a year behind my plans with sbp2 due to lack of time.)
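For anyone collecting these values, the layout of the magic number described above (last byte is firmware revision, remaining bytes are the chip type) can be sketched as follows. This is an illustration only: the function names and the lookup table are hypothetical, the only chip/value pair known from this report is 0x88000731 for an unaffected OXFW911, and it is an assumption that the 4 bytes printed by 1394commander are in bus (big-endian) order. No affected firmware revision is known yet, so the sketch deliberately contains no blacklist.

```python
# Hypothetical lookup table; the single entry comes from the enclosure
# mentioned in this report (magic number 0x88000731, chip OXFW911).
KNOWN_CHIPS = {0x880007: "OXFW911"}


def decode_oxsemi_quadlet(quadlet):
    """Split the quadlet read at offset 0xFFFFF0050000 into two fields.

    Returns (chip_id, firmware_rev): the last byte is taken as the
    firmware revision, the upper three bytes as the chip type.
    """
    if not 0 <= quadlet <= 0xFFFFFFFF:
        raise ValueError("quadlet must fit in 32 bits")
    chip_id = quadlet >> 8
    firmware_rev = quadlet & 0xFF
    return chip_id, firmware_rev


def quadlet_from_bytes(data):
    """Build the quadlet from 4 raw bytes, assuming big-endian bus order."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(data, "big")


def describe(quadlet):
    """Return a one-line human-readable summary of the quadlet."""
    chip_id, firmware_rev = decode_oxsemi_quadlet(quadlet)
    chip = KNOWN_CHIPS.get(chip_id, "unknown chip")
    return "%s (id 0x%06x), firmware revision 0x%02x" % (chip, chip_id, firmware_rev)


if __name__ == "__main__":
    print(describe(0x88000731))
    print(describe(quadlet_from_bytes(bytes([0x88, 0x00, 0x07, 0x31]))))
```

Values read from affected enclosures could then be compared field by field against the unaffected one, which is what a warning or workaround in sbp2 would need to key on.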
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.
(this is a mass-close to kernel bugs in NEEDINFO state) As indicated previously there has been no update on the progress of this bug therefore I am closing it as INSUFFICIENT_DATA. Please re-open if the issue still occurs for you and I will try to assist in its resolution. Thank you for taking the time to report the initial bug. If you believe that this bug was closed in error, please feel free to reopen this bug.