Bug 235199

Summary: firewire scanner problems with 2.6.21.x kernels
Product: [Fedora] Fedora Reporter: Joseph Sacco <jsacco>
Component: kernel Assignee: Kristian Høgsberg <krh>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: rawhide CC: ak, cweyl, dsyrstad, fenlason, jarod
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-08-27 14:20:22 UTC Type: ---

Description Joseph Sacco 2007-04-04 14:04:51 UTC
An Epson 2450 Perfection scanner with both USB and firewire connections
works well using either type of connection when running FC6 [2.6.20.4
kernel].

When running fedora/rawhide [2.6.21.x kernel] the scanner works well
using the USB connection but has problems when using the firewire
connection.

With firewire and the 2.6.21.x kernel:
* scanner is recognized
* scan starts and then hangs

/var/log/messages indicates that the connection to (or registration of)
the firewire device has been lost:

Apr  4 09:43:45  plantain kernel: fw_core: phy config: card 1, new root=ffc1, gap_count=5
Apr  4 09:43:45  plantain kernel: fw_core: created new fw device fw2 (0 config rom retries)
Apr  4 09:43:45  plantain kernel: fw_sbp2: logged in to sbp2 unit fw2.0 (0 retries)
Apr  4 09:43:45  plantain kernel: fw_sbp2:  - management_agent_address:    0xfffff0010020
Apr  4 09:43:45  plantain kernel: fw_sbp2:  - command_block_agent_address: 0xfffff0010100
Apr  4 09:43:46  plantain kernel: fw_sbp2:  - status write address:        0x000100000000
Apr  4 09:43:46  plantain kernel: scsi2 : SBP-2 IEEE-1394
Apr  4 09:43:46  plantain kernel: scsi 2:0:0:0: Processor         EPSON    GT-9700          1.05 PQ: 0 ANSI: 4
Apr  4 09:43:46  plantain kernel: scsi 2:0:0:0: Attached scsi generic sg3 type 3
Apr  4 09:44:10  plantain kernel: ppdev: user-space parallel port driver
Apr  4 09:46:27  plantain kernel: fw_sbp2: sbp2_scsi_abort
Apr  4 09:46:37  plantain kernel: fw_sbp2: sbp2_scsi_abort
Apr  4 09:46:37  plantain kernel: scsi 2:0:0:0: scsi: Device offlined - not ready after error recovery
Apr  4 09:46:38  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device
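
The failure signature in the trace above (one or more sbp2_scsi_abort messages followed by the SCSI layer taking the device offline) can be picked out of a syslog file mechanically. What follows is only a minimal sketch in Python, assuming the classic syslog line layout shown above; the function name find_offline_events and the sample lines are illustrative and not part of any existing tool.

```python
import re

# Matches the classic syslog prefix "Mon DD HH:MM:SS host kernel: message".
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+kernel:\s+(.*)$")


def find_offline_events(lines):
    """Return (timestamp, abort_count) for each 'Device offlined' message.

    abort_count is the number of fw_sbp2 abort messages seen since the
    previous offline event (or since the start of the input).
    """
    events = []
    aborts = 0
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a kernel message, or a wrapped continuation line
        stamp, _host, message = match.groups()
        if "sbp2_scsi_abort" in message:
            aborts += 1
        elif "Device offlined" in message:
            events.append((stamp, aborts))
            aborts = 0
    return events


# Sample lines copied from the trace in this report.
sample = [
    "Apr  4 09:46:27  plantain kernel: fw_sbp2: sbp2_scsi_abort",
    "Apr  4 09:46:37  plantain kernel: fw_sbp2: sbp2_scsi_abort",
    "Apr  4 09:46:37  plantain kernel: scsi 2:0:0:0: scsi: Device offlined - not ready after error recovery",
]
print(find_offline_events(sample))
```

For the sample, this reports a single offline event at 09:46:37, preceded by two aborts.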



-Joseph

Comment 1 Kristian Høgsberg 2007-04-04 15:31:06 UTC
What is the exact version and release of the kernel you're using?

Comment 2 Joseph Sacco 2007-04-04 15:50:41 UTC
Kristian,

The trace above was generated while running 

   2.6.20-1.3040.fc7smp

on a dual G4 PowerMac.

-Joseph

Comment 3 Kristian Høgsberg 2007-04-06 23:01:20 UTC
I fixed a critical error that would cause data corruption, typically under heavy
throughput; the fix is in kernel-2.6.20-1.3051.fc7.  If you could give that a
try and see if it fixes the problem, that'd be great.

Comment 4 Joseph Sacco 2007-04-07 15:46:54 UTC
I do not [yet] see kernel-2.6.20-1.3051.fc7 in

http://download.fedora.redhat.com/pub/fedora/linux/core/development/ppc/os/Fedora/

Is there another URL?

-Joseph

Comment 5 Joseph Sacco 2007-04-10 14:00:32 UTC
The problem persists with kernel-2.6.20-1.3054.fc7smp [see below].

-Joseph

==========================================================================

Apr 10 09:51:15  plantain kernel: scsi2 : SBP-2 IEEE-1394
Apr 10 09:51:15  plantain kernel: scsi 2:0:0:0: Processor         EPSON    GT-9700          1.05 PQ: 0 ANSI: 4
Apr 10 09:51:15  plantain kernel: scsi 2:0:0:0: Attached scsi generic sg3 type 3
Apr 10 09:52:39  plantain kernel: ppdev: user-space parallel port driver
Apr 10 09:53:01  plantain ntpd[1970]: synchronized to 18.7.21.144, stratum 2
Apr 10 09:53:01  plantain ntpd[1970]: time reset -0.343515 s
Apr 10 09:53:02  plantain ntpd[1970]: kernel time sync enabled 0001
Apr 10 09:55:22  plantain kernel: fw_sbp2: sbp2_scsi_abort
Apr 10 09:55:32  plantain kernel: fw_sbp2: sbp2_scsi_abort
Apr 10 09:55:32  plantain kernel: scsi 2:0:0:0: scsi: Device offlined - not ready after error recovery
Apr 10 09:55:32  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device
Apr 10 09:55:33  plantain last message repeated 5693 times
Apr 10 09:55:33  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline scsi 2:0:0:0: rejecting I/O to offline device
Apr 10 09:55:33  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device
Apr 10 09:55:33  plantain last message repeated 81 times
Apr 10 09:55:33  plantain kernel: scsi 2:0:0:0: rejecting I/O scsi 2:0:0:0: rejecting I/O to offline devscsi 2:0:0:0: rejecting I/O to offline device
Apr 10 09:55:33  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device
Apr 10 09:55:33  plantain last message repeated 80 times
Apr 10 09:55:33  plantain kernel: scsi 2:0:0:0: rejecting I/O to offlscsi 2:0:0:0: rejecting I/O to offline device

                              ....
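
To get a feel for how many requests were rejected once the device went offline, the "last message repeated N times" summaries have to be added to the individually logged lines. The Python sketch below is one way to do that, under the assumption that each repeat summary refers to the message logged immediately before it; count_rejections is a made-up helper name, and an interleaved line (two kernel messages run together) is counted only once, so the result is a lower bound.

```python
import re

REPEAT_RE = re.compile(r"last message repeated (\d+) times")
REJECT_TEXT = "rejecting I/O to offline device"


def count_rejections(lines):
    """Count rejected-I/O messages, expanding syslog repeat summaries."""
    total = 0
    previous_was_reject = False
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat:
            # A repeat summary stands for N more copies of the previous message.
            if previous_was_reject:
                total += int(repeat.group(1))
            continue
        previous_was_reject = REJECT_TEXT in line
        if previous_was_reject:
            total += 1
    return total


# Sample lines copied from the trace in this comment.
sample = [
    "Apr 10 09:55:32  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device",
    "Apr 10 09:55:33  plantain last message repeated 5693 times",
    "Apr 10 09:55:33  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device",
    "Apr 10 09:55:33  plantain last message repeated 81 times",
]
print(count_rejections(sample))
```

For these four lines the total is 5776 (1 + 5693 + 1 + 81), all logged within about a second.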





Comment 6 Joseph Sacco 2007-04-11 16:42:29 UTC
The problem persists with the 2.6.20-1.3056 kernel [see below]

-Joseph

===========================================================================

Apr 11 12:36:50  plantain kernel: fw_core: phy config: card 1, new root=ffc1, gap_count=5
Apr 11 12:36:50  plantain kernel: fw_core: created new fw device fw2 (0 config rom retries)
Apr 11 12:36:51  plantain kernel: fw_sbp2: logged in to sbp2 unit fw2.0 (0 retries)
Apr 11 12:36:51  plantain kernel: fw_sbp2:  - management_agent_address:    0xfffff0010020
Apr 11 12:36:51  plantain kernel: fw_sbp2:  - command_block_agent_address: 0xfffff0010100
Apr 11 12:36:51  plantain kernel: fw_sbp2:  - status write address:        0x000100000000
Apr 11 12:36:51  plantain kernel: scsi2 : SBP-2 IEEE-1394
Apr 11 12:36:51  plantain kernel: scsi 2:0:0:0: Processor         EPSON    GT-9700          1.05 PQ: 0 ANSI: 4
Apr 11 12:36:51  plantain kernel: scsi 2:0:0:0: Attached scsi generic sg3 type 3
Apr 11 12:37:05  plantain kernel: ppdev: user-space parallel port driver
Apr 11 12:37:32  plantain ntpd[1985]: synchronized to 18.7.21.144, stratum 2
Apr 11 12:37:32  plantain ntpd[1985]: kernel time sync enabled 0001
Apr 11 12:39:48  plantain kernel: fw_sbp2: sbp2_scsi_abort
Apr 11 12:39:58  plantain kernel: fw_sbp2: sbp2_scsi_abort
Apr 11 12:39:58  plantain kernel: scsi 2:0:0:0: scsi: Device offlined - not ready after error recovery
Apr 11 12:39:59  plantain kernel: scsi 2:0:0:0: rejecting I/O to offline device


Comment 7 Joseph Sacco 2007-04-12 15:22:40 UTC
The problem persists with the 2.6.20-1.3059 kernel.

-Joseph

Comment 8 Joseph Sacco 2007-04-13 15:45:41 UTC
The problem persists with the 2.6.20-1.3062 kernel.

-Joseph

Comment 9 Joseph Sacco 2007-04-17 19:14:27 UTC
The problem persists with the 2.6.20-1.3079 kernel.

-Joseph

Comment 10 Joseph Sacco 2007-04-20 00:32:49 UTC
The problem persists with the 2.6.20-1.3088 kernel.

-Joseph

Comment 11 Alex Kanavin 2007-04-21 13:53:10 UTC
I'm experiencing a similar problem with an external hard drive: bug #231708. If this is not resolved in
test 4, due next week, dare I suggest that this new firewire stack be omitted from Fedora 7?

Comment 12 Joseph Sacco 2007-04-27 17:33:49 UTC
The problem persists with the 2.6.21-1.3116.fc7smp kernel.

-Joseph

Comment 13 Joseph Sacco 2007-05-15 16:02:47 UTC
The problem persists with the 2.6.21-1.3142.fc7smp kernel.

-Joseph

Comment 14 Kristian Høgsberg 2007-05-23 02:10:10 UTC
I have a fix for #238606 currently building, and my guess is that it's the same
bug you're seeing with the scanner.  Comment #9 in bug #238606 has a link to the
build system tracking page for the kernel (.3190) with the fix.  Thanks for all
the testing up until now!
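
For anyone checking whether an installed kernel is at or past the build mentioned above, the release string can be compared numerically. The following is a rough Python sketch, not an official tool: it assumes the release strings look like the ones quoted in this report (for example 2.6.21-1.3194.fc7smp), that the build number after "-1." only ever increases across this Fedora series, and that 3190 (taken from the ".3190" above) is the first build carrying the fix. The names build_number and has_fix are hypothetical.

```python
import re

# Fedora kernel release as quoted in this report, e.g. "2.6.21-1.3194.fc7smp".
# The distribution suffix (".fc7", ".fc7smp") is optional.
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)(?:\.(\w+))?$")

FIX_BUILD = 3190  # assumption: first build with the fix, per the ".3190" above


def build_number(release):
    """Extract the Fedora build number (e.g. 3194) from a kernel release."""
    if release.startswith("kernel-"):
        release = release[len("kernel-"):]
    match = RELEASE_RE.match(release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(5))


def has_fix(release):
    """True if the release is at or past the assumed fix build."""
    return build_number(release) >= FIX_BUILD


for release in ("2.6.20-1.3040.fc7smp", "2.6.21-1.3142.fc7smp", "2.6.21-1.3194.fc7smp"):
    print(release, has_fix(release))
```

On a running system the string to test would typically come from `uname -r`.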

Comment 15 Joseph Sacco 2007-05-26 15:55:31 UTC
Confirmed. The problem has gone away when using the 2.6.21-1.3194.fc7smp kernel.

Thanks,


-Joseph

Comment 16 Dan Syrstad 2007-06-04 13:48:46 UTC
I'm still having a problem, similar to comment #11, with my Maxtor firewire
drive. It worked fine on FC6 a few hours ago, but now after upgrading to FC7,
kernel 2.6.21-1.3194.fc7smp reports:

Jun  4 07:50:30 silver kernel: fw_sbp2: management write failed, rcode 0x13
Jun  4 07:50:30 silver kernel: fw_sbp2: removed sbp2 unit fw1.0
Jun  4 07:50:38 silver kernel: fw_sbp2: Workarounds for node fw1.0: 0x1 (firmware_revision 0xa0b82d, model_id 0x000000)
Jun  4 07:50:38 silver kernel: fw_core: created new fw device fw1 (0 config rom retries)
Jun  4 07:50:38 silver kernel: fw_sbp2: logged in to sbp2 unit fw1.0 (0 retries)
Jun  4 07:50:38 silver kernel: fw_sbp2:  - management_agent_address:    0xfffff0030000
Jun  4 07:50:38 silver kernel: fw_sbp2:  - command_block_agent_address: 0xfffff0010000
Jun  4 07:50:38 silver kernel: fw_sbp2:  - status write address:        0x000100000000
Jun  4 07:50:38 silver kernel: scsi12 : SBP-2 IEEE-1394
Jun  4 07:50:38 silver kernel: scsi 12:0:0:0: Direct-Access     Maxtor   1394 storage     60   PQ: 0 ANSI: 0
Jun  4 07:50:38 silver kernel: SCSI device sdg: 234441648 512-byte hdwr sectors (120034 MB)
Jun  4 07:50:38 silver kernel: sdg: Write Protect is off
Jun  4 07:50:38 silver kernel: sdg: cache data unavailable
Jun  4 07:50:38 silver kernel: sdg: assuming drive cache: write through
Jun  4 07:50:38 silver kernel: SCSI device sdg: 234441648 512-byte hdwr sectors (120034 MB)
Jun  4 07:50:38 silver kernel: sdg: Write Protect is off
Jun  4 07:50:38 silver kernel: sdg: cache data unavailable
Jun  4 07:50:38 silver kernel: sdg: assuming drive cache: write through
Jun  4 07:50:38 silver kernel:  sdg: sdg1
Jun  4 07:50:38 silver kernel: sd 12:0:0:0: Attached scsi disk sdg
Jun  4 07:50:38 silver kernel: sd 12:0:0:0: Attached scsi generic sg8 type 0
Jun  4 07:50:38 silver kernel: fw_core: phy config: card 0, new root=ffc0, gap_count=5
Jun  4 07:50:38 silver kernel: fw_sbp2: reconnected to unit fw1.0 (0 retries)
Jun  4 07:50:40 silver kernel: XFS mounting filesystem sdg1
Jun  4 07:50:40 silver kernel: end_request: I/O error, dev sdg, sector 117553237
Jun  4 07:50:40 silver kernel: I/O error in filesystem ("sdg1") meta-data dev sdg1 block 0x701b816       ("xlog_bread") error 5 buf count 262144
Jun  4 07:50:40 silver kernel: XFS: empty log check failed
Jun  4 07:50:40 silver kernel: XFS: log mount/recovery failed: error 5
Jun  4 07:50:40 silver kernel: XFS: log mount failed
Jun  4 07:52:04 silver kernel: fw_sbp2: management write failed, rcode 0x13
Jun  4 07:52:04 silver kernel: fw_sbp2: removed sbp2 unit fw1.0
Jun  4 07:52:25 silver kernel: fw_sbp2: Workarounds for node fw1.0: 0x1 (firmware_revision 0xa0b82d, model_id 0x000000)
Jun  4 07:52:25 silver kernel: fw_core: created new fw device fw1 (0 config rom retries)
Jun  4 07:52:25 silver kernel: fw_sbp2: logged in to sbp2 unit fw1.0 (0 retries)
Jun  4 07:52:25 silver kernel: fw_sbp2:  - management_agent_address:    0xfffff0030000
Jun  4 07:52:25 silver kernel: fw_sbp2:  - command_block_agent_address: 0xfffff0010000
Jun  4 07:52:25 silver kernel: fw_sbp2:  - status write address:        0x000100000000
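
The numbers in this trace can be sanity-checked with a little arithmetic, which helps rule out an out-of-range request as the cause of the I/O error. The Python sketch below uses only values copied from the log; the one assumption is that the XFS "block" value is a 512-byte sector count relative to the start of sdg1.

```python
SECTOR_BYTES = 512
TOTAL_SECTORS = 234441648      # "SCSI device sdg: 234441648 512-byte hdwr sectors"
FAILED_SECTOR = 117553237      # "end_request: I/O error, dev sdg, sector 117553237"
FAILED_BLOCK_SDG1 = 0x701b816  # XFS "meta-data dev sdg1 block 0x701b816"

# Capacity in decimal megabytes, the unit the kernel message uses.
capacity_mb = TOTAL_SECTORS * SECTOR_BYTES // 10**6

# Where on the disk the failed read landed, as a fraction of its size.
position = FAILED_SECTOR / TOTAL_SECTORS

# Difference between the whole-disk sector and the partition-relative block
# (assumed to be in 512-byte units): the implied start of sdg1.
partition_offset = FAILED_SECTOR - FAILED_BLOCK_SDG1

print(capacity_mb)
print(round(position, 3))
print(partition_offset)
```

The capacity works out to 120034 MB, matching the log, and the failed read sits almost exactly halfway through the disk. Under the stated assumption the implied partition start is sector 63, so the request looks like an ordinary in-range read (the function named in the error, xlog_bread, suggests a read of the XFS log) rather than a bogus address.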

Comment 17 Joseph Sacco 2007-06-04 16:55:45 UTC
I concur. The current kernels for FC7 [2.6.21-1.3194] and rawhide
[2.6.21-1.3200] have broken the firewire stack.

Grrrr....


-Joseph

Comment 18 Alex Kanavin 2007-06-04 17:24:12 UTC
Agreed. The new stack should've been held off until F8. It's supposed to fix problems (and many people,
including myself, had none to begin with), not introduce them.

Comment 19 Alex Kanavin 2007-06-07 13:50:40 UTC
More troubles with the new stack:
http://www.linux.com/article.pl?sid=07/06/06/1327234

Kristian, what's your take?

"One of the features in F7 is a rewritten FireWire stack for the kernel. I'm not
sure what was supposed to be wrong with the existing FireWire stack, which I've
been using for more than a year for storage and transferring video from a
digital camera to my ThinkPad and desktop machine, but apparently the new stack
is supposed to be better.

I say "supposed to be" because it seems to have a few kinks that need working
out before it's ready for widespread use. Not more than three hours after
installing Fedora, my system beeped twice and locked up tight. I wasn't sure at
first what the cause might have been -- at the time I had several programs open,
including Pirut, Firefox, several terminal windows, OpenOffice.org Writer, and a
few Nautilus windows.

I rebooted to see if I could reproduce the problem. After about two hours, once
again, I received a couple of console beeps and the machine locked up. I checked
the system logs and noticed an error about a hardware problem on the PCI bus. I
thought the problem might be related to enabling desktop effects, so I disabled
desktop effects and went about my business, but it turned out that wasn't the
issue at all.

After disabling desktop effects, I found that I couldn't access my external
FireWire drive, and I was receiving error messages when I attempted to reconnect
it. I thought the problem might have been a dying drive -- until I connected the
drive to my Ubuntu desktop and was able to access it with no problem. Since
removing the drive, I've had no further problems with Fedora, other than a
couple of Firefox crashes.

After getting all my data off the drive, in case the drive really was going
wonky, I kept an eye on it and copied several gigabytes worth of files back and
forth to see if I could cause a failure or even find any error messages. After a
couple of days, I've had no further problems with the drive. I have to conclude
that the problem was the new FireWire stack in F7 rather than the drive."

Comment 20 Dave Jones 2007-08-27 14:20:22 UTC
Please open separate bugs for additional problems.

Thanks.