108153 – kernel error messages being generated by aic7xxx

Summary: kernel error messages being generated by aic7xxx

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Doug Ledford
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2003-10-28 03:37 UTC by gezp usop
Modified:	2007-11-30 22:06 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2005-10-18 03:03:35 UTC
Target Upstream Version:
Embargoed:

Description gezp usop 2003-10-28 03:37:21 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030922

Description of problem:
/var/log/messages is growing at an astounding rate (50+ MB/day) with the
following message:

Oct 26 20:11:24 millennium kernel: (scsi0:A:3:0): No or incomplete CDB sent to device.
Oct 26 20:11:24 millennium kernel: (scsi0:A:3:0): Protocol violation in Message-in phase.  Attempting to abort.
Oct 26 20:11:24 millennium kernel: (scsi0:A:3:0): Abort Message Sent
Oct 26 20:11:24 millennium kernel: (scsi0:A:3:0): SCB 4 - Abort Completed.
Oct 26 20:11:25 millennium kernel: (scsi0:A:3:0): refuses WIDE negotiation.  Using 8bit transfers

The 3rd SCSI device that the message seems to be referring to is a Yamaha
CRW4416S CD-RW.
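
To confirm which device the messages point at, the `(scsi0:A:3:0)` prefix can be split into its fields. The following is a minimal sketch, assuming the prefix is laid out as host number, channel letter, target ID, and LUN (the same order as the `add-single-device 0 0 3 0` arguments used later in this report). The function names are hypothetical, and the sample lines are copied from the log excerpt above with the line wrapping undone.

```python
import re
from collections import Counter

# Matches a prefix such as "(scsi0:A:3:0)".
# Assumed field order: host number, channel letter, target ID, LUN.
DEVICE_TAG = re.compile(r"\(scsi(\d+):([A-Z]):(\d+):(\d+)\)")


def parse_device_tag(line):
    """Return (host, channel, target, lun) found in a log line, or None."""
    match = DEVICE_TAG.search(line)
    if match is None:
        return None
    host, channel, target, lun = match.groups()
    return int(host), channel, int(target), int(lun)


def count_by_device(lines):
    """Tally how many log lines mention each device tag."""
    counts = Counter()
    for line in lines:
        tag = parse_device_tag(line)
        if tag is not None:
            counts[tag] += 1
    return counts


sample = [
    "Oct 26 20:11:24 millennium kernel: (scsi0:A:3:0): No or incomplete CDB sent to device.",
    "Oct 26 20:11:24 millennium kernel: (scsi0:A:3:0): Abort Message Sent",
]
print(count_by_device(sample))
```

Under that reading, the tag names target ID 3 on channel A of host 0. SCSI IDs start at 0, so ID 3 is not necessarily the third device on the bus.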

This morning the machine had frozen and I had to do a cold reboot to get it to
work again.

When I googled the message I found that it may be due to an older version of the
aic7xxx kernel module.  

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-4.EL

How reproducible:
Always

Steps to Reproduce:
1. boot up the system
2. view the log file
    

Actual Results:  The log file begins to grow and sometime in the next day it
will freeze (kernel panic?) and I will have to restart.  

Additional info:

Comment 1 Doug Ledford 2003-10-29 19:08:16 UTC

Justin, does this problem ring any bells?  This is using 6.2.36 + the CHIPRST fix.

Comment 2 Justin T. Gibbs 2003-10-29 19:37:58 UTC

The error indicates that the target did not transfer all bytes of the CDB
prior to performing some phase that requires a successfully transferred
CDB.

Smells like some userland application is sending in a command with
the wrong CDB length.  I've seen this lots of times with improperly
written OEM utilities.  Perhaps there is some bug in the "media detection
daemon" for CD devices?
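
As a rough illustration of what "the wrong CDB length" means, the expected length of a fixed-length CDB can be derived from the group code in the top three bits of the opcode. This is only a sketch of that rule, not code from the driver or from any of the utilities mentioned; the function names are made up for the example.

```python
# Expected CDB length keyed by the opcode's group code (top three bits).
# Groups 3, 6 and 7 are reserved or vendor-specific and have no fixed length.
CDB_LENGTH_BY_GROUP = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}


def expected_cdb_length(opcode):
    """Return the fixed CDB length for this opcode, or None if not fixed."""
    group = (opcode >> 5) & 0x7
    return CDB_LENGTH_BY_GROUP.get(group)


def cdb_length_ok(cdb):
    """Check that a CDB is as long as its opcode's group code requires."""
    if len(cdb) == 0:
        return False
    expected = expected_cdb_length(cdb[0])
    # Opcodes without a fixed length cannot be checked by this rule.
    return expected is None or len(cdb) == expected


# INQUIRY (opcode 0x12) is a group 0 command, so it must be 6 bytes long.
inquiry = bytes([0x12, 0x00, 0x00, 0x00, 0x24, 0x00])
print(cdb_length_ok(inquiry))      # True
print(cdb_length_ok(inquiry[:5]))  # False: one byte short
```

An application that passes a length that disagrees with the opcode would fail a check like this one.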

Comment 3 Doug Ledford 2003-10-29 20:40:05 UTC

Hmmm...possibly, although the disk detection software hasn't changed much in the
last year or so.  So, since this is a new type of bug report it would mean the
older driver tolerated the situation while the new one doesn't.

gezp:  If you are using the standard desktop, can you go to Preferences->CD
Properties and turn off all the options present?  That should disable magicdev
entirely, and if it's causing the problem, the messages in your log should stop
after you make those changes.  Let me know if the messages do stop after doing
that (or the equivalent thing under KDE if that's what you use).

Comment 4 Justin T. Gibbs 2003-10-29 20:42:46 UTC

Just FYI, the protocol violation checking *is* a fairly recent addition
to the driver.  It was added so we could pass certain OEM suites that
test the behavior of our driver with faulty targets.

Comment 5 Doug Ledford 2003-10-29 20:44:21 UTC

What's the chance this is a faulty target instead of a broken SCSI command?

Comment 6 Justin T. Gibbs 2003-10-29 20:48:49 UTC

I think it's much more likely to be faulty software than a device.  I've
only seen "real errors" due to software outside of the test suite.

BTW, you may also be seeing "underflow" errors from this version of the
driver.  This is due to the scsi_unique_id utility using the old SCSI
pass-thru mechanism that doesn't properly set the underflow field.  When
reading things like serial numbers, scsi_unique_id should be setting
underflow to 0 since a variably sized response is expected.  Unfortunately,
the old pass-thru interface doesn't allow this.  Because of this problem,
I have disabled underflow error reporting in the most recent release of
the driver.
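
The underflow check described above can be sketched roughly as follows. This is an illustration only, not the driver's code: it assumes the check compares the bytes actually transferred against a per-command minimum (the underflow field), and that a pass-thru interface which cannot set that field leaves it equal to the full buffer length. The function names and the byte counts are hypothetical.

```python
def transferred_bytes(requested_len, residual):
    """Bytes actually moved: the requested length minus what was left over."""
    return requested_len - residual


def is_underflow(requested_len, residual, underflow):
    """Report an underflow when fewer than `underflow` bytes were moved."""
    return transferred_bytes(requested_len, residual) < underflow


# Hypothetical request: a 255-byte buffer for a variably sized response
# (such as a serial number) of which the device returns only 20 bytes.
request_len = 255
residual = 235

# Assumed old pass-thru behaviour: underflow equals the buffer length.
print(is_underflow(request_len, residual, underflow=request_len))  # True

# With underflow set to 0, a short response is accepted.
print(is_underflow(request_len, residual, underflow=0))  # False
```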

Comment 7 gezp usop 2003-10-30 03:32:06 UTC

Two comments:  I removed the Yamaha CDRW drive and everything went back to
normal.  No log messages, no freezing, everything was fine.
I then turned off all CD preferences as Doug asked, rebooted, and kept getting
error messages:

Oct 29 21:31:22 millennium kernel: cdrom: This disc doesn't have any tracks I recognize!
Oct 29 21:31:23 millennium kernel: Device not ready.  Make sure there is a disc in the drive.
Oct 29 21:31:54 millennium last message repeated 15 times
Oct 29 21:32:46 millennium last message repeated 26 times

and:

Oct 29 21:33:22 millennium kernel: Device not ready.  Make sure there is a disc in the drive.
Oct 29 21:33:54 millennium last message repeated 16 times
Oct 29 21:34:55 millennium last message repeated 30 times

Comment 8 Cristiano Duarte 2004-02-11 16:09:23 UTC

I'm experiencing the same bug with a Sceptre S1200 scanner. It was
detected by the Red Hat 9 kernel, but since I upgraded to Fedora Core 1, the
scanner isn't recognized by the kernel anymore (the Adaptec card still
detects it) and the following error is reported on boot:
(scsi0:A:3:0): No or incomplete CDB sent to device.
(scsi0:A:3:0): Protocol violation in Message-in phase.  Attempting to abort.
(scsi0:A:3:0): Abort Message Sent
(scsi0:A:3:0): SCB 3 - Abort Completed.

The SCSI Zip drive is detected by the Adaptec card and by the kernel.
I tried to force the detection with this command:
# echo "scsi add-single-device 0 0 3 0" > /proc/scsi/scsi
But I still got the same error messages, this time only in
/var/log/messages (not on standard output).
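
For reference, the four numbers in that command are host, channel, target ID, and LUN, in that order. A small sketch of how they line up with the `(scsi0:A:3:0)` tag in the error messages, assuming channel letter A corresponds to channel number 0 (the helper name is hypothetical, and it only builds the string rather than writing to /proc/scsi/scsi):

```python
def add_single_device_command(host, channel, target, lun):
    """Build the string written to /proc/scsi/scsi to probe one device.

    `channel` may be a number or a letter; the mapping of "A" to 0 and
    "B" to 1 is an assumption of this sketch.
    """
    if isinstance(channel, str):
        channel = ord(channel.upper()) - ord("A")
    return "scsi add-single-device {} {} {} {}".format(host, channel, target, lun)


# The device tagged (scsi0:A:3:0) in the log: host 0, channel A, ID 3, LUN 0.
print(add_single_device_command(0, "A", 3, 0))
```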

My OS: Fedora Core 1, kernel 2.4.22-1.2166.nptl
My HW: Pentium III 933 MHz, 256 MB RAM, 40 GB IDE HD, Adaptec AHA-2940UW
SCSI Peripherals: Zip Drive 100 MB, Sceptre S1200 scanner

Comment 9 Cristiano Duarte 2004-02-19 14:46:27 UTC

The Red Hat 9 kernel (2.4.20) recognizes the scanner but doesn't let
utilities talk to SCSI devices (the result is garbage).
The Fedora Core 1 kernel (2.4.22) doesn't even recognize the scanner.
A downgrade to the Red Hat 7.3 kernel (2.4.18) made everything work. The
kernel recognizes the scanner and the scanner utility was able to talk
to the scanner.

It looks like some bug was introduced in kernels after 2.4.18. My
SCSI card is an Adaptec AHA-2940UW and uses the aic7xxx kernel module. It
seems there is no maintainer for this kernel module now, so who's
gonna save us! :-(

Comment 10 Doug Ledford 2005-10-18 03:03:35 UTC

This is an extremely old report.  Closing due to lack of activity.  If the
problem still exists with current versions of software, then please open a new
bug report reflecting that.
