178870 – aic79xx panics on boot with "Adaptec AIC-7902B U320"

Keywords:
Status:	CLOSED INSUFFICIENT_DATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	5
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Dave Jones
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2006-01-25 00:28 UTC by Konstantin Olchanski
Modified:	2015-01-04 22:24 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-11-24 22:51:22 UTC
Type:	---
Embargoed:
Dependent Products:

Description Konstantin Olchanski 2006-01-25 00:28:56 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050909 Fedora/1.7.10-1.3.2

Description of problem:
The aic79xx driver in kernel-2.6.15-1.1871_FC5.x86_64 panics when booting, with messages about "bad locking" or something like that. kernel-2.6.15-1.1826.2.10_FC5.x86_64 from FC5T2 has the same problem. 2.6.9-22.0.1.ELsmp boots okay.

This problem puts me in a quandary: 2.6.9-22.0.1.ELsmp has working SCSI and 2.6.15-1.x has working SATA (sata_mv for Marvell MV88SX5081 8-port SATA), but neither kernel can access both the disks and the SCSI tapes at the same time. Ouch!

Bug https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=175309 may be the same or related.

The lspci entries are:
02:03.0 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
02:03.1 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)

The dmesg entries are (2.6.9-22.0.1.ELsmp):
SCSI subsystem initialized
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 50-66Mhz, 512 SCBs
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 50-66Mhz, 512 SCBs

More info available on request.

K.O.


Version-Release number of selected component (if applicable):
kernel-2.6.15-1.1871_FC5

How reproducible:
Always

Steps to Reproduce:
1. blah...
2.
3.


Additional info:

Comment 1 Sammy 2006-01-25 14:24:56 UTC

Somehow my U320 ones are working with kernel-2.6.15-1.1826.2.10_FC5.x86_64
(did not try the newer kernels). My dmesg lines are identical to yours. My
lspci lines are a bit different:

05:04.0 RAID bus controller: Adaptec ASC-39320(B) U320 w/HostRAID (rev 10)
05:04.1 RAID bus controller: Adaptec ASC-39320(B) U320 w/HostRAID (rev 10)

Comment 2 Dave Jones 2006-01-26 03:16:54 UTC

Can you capture those bad locking messages? Even a digital camera photo would
be better than nothing. (Booting with vga=791 [or vga=1 if your monitor won't do
791] will also get more lines of text onscreen.)

Comment 3 Konstantin Olchanski 2006-01-27 02:02:26 UTC

To answer your question (can you ...?) about capture of panic data: in the stone
age of Silicon Graphics IRIX machines circa 1992, there was a command to email
you (SGI support) the report from the last panic (after a panic(), all the RAM
contents were dumped to disk, then you gdb them like a normal core dump). Now, in
the modern day of digital cameras, I will type in the stack trace from the
little screen of my digital camera. Definitely better than pencil and paper.
Here goes:

On boot, there are messages about the first SCSI channel (no devices there),
then messages about the second SCSI channel (SDLT tape robot, two SCSI devices:
robot and tape drive).
(So far, same as the messages in my report above.)
Then, where the good kernel reports speed negotiations (after some haggling, the
tape drive eventually negotiates 40 Mbytes/sec (20 MHz, 16-bit) and the robot
settles on 10 (or 20???) Mbytes/sec),
the bad one spews a bunch of SCSI card state dump (too fast to capture).
Then, after a short delay, there is a panic() with a stack trace like this:

panic ...: bad locking

panic
show_trace
show_trace {_raw_spin_unlock+46}
_spin_unlock_irq_restore+9 :aic79xx:ahd_linux_queue_recovery_cmd+2578}
:aic79xx:ahd_linux_sem_timeout+0
:aic79xx:ahd_linux_queue_recovery_cmd+2
keventd_create_kthread+0
:scsi_mod:scsi_error_handler+1231
:scsi_mod:scsi_error_handler+0
keventd_create_kthread+0
... (I am tired of typing: boring kevent stuff follows)

I have the digital image trapped in the camera, will get it out at home and
attach to this bug report.

K.O.
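
For reference, the "40 Mbytes/sec (20 MHz, 16-bit)" figure quoted above is just clock rate times bus width. A minimal sketch of that arithmetic follows; the function name and the `transfers_per_clock` parameter are illustrative assumptions, not anything taken from the aic79xx driver.

```python
def parallel_scsi_rate_mb_s(clock_mhz, bus_width_bits, transfers_per_clock=1):
    """Peak parallel SCSI rate in Mbytes/sec.

    Hypothetical helper: rate = clock (MHz) x transfers per clock
    x bytes moved per transfer (bus width in bits / 8).
    """
    return clock_mhz * transfers_per_clock * (bus_width_bits / 8)


# Tape drive figures from the comment above: 20 MHz, 16-bit wide bus.
print(parallel_scsi_rate_mb_s(20, 16))  # 40.0
```

The same formula covers either reading of the robot's rate, depending on which clock and bus width it actually negotiated; the comment does not say.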

Comment 4 Dave Jones 2006-03-06 05:23:15 UTC

Is this still happening with the latest kernel? (There's an even newer one
than the one in rawhide at http://people.redhat.com/davej/kernels/Fedora/devel/)

Comment 5 Konstantin Olchanski 2006-03-06 05:42:49 UTC

Thanks for the pointer. I will try kernel-2.6.15-1.2016_FC5.x86_64.rpm tomorrow.
K.O.

Comment 6 Dave Jones 2006-10-17 00:04:01 UTC

A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.
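
One possible way to run the device-mapper and lvm2 check described above is sketched here. It assumes the standard `rpm -q --qf` query with the NAME, VERSION, RELEASE and ARCH tags; the helper names `find_duplicates` and `query_installed` are made up for this example, and bug 207474 remains the authoritative description of the problem.

```python
import subprocess
from collections import defaultdict

# One line per installed package instance: name, version-release, arch.
QUERY_FORMAT = "%{NAME} %{VERSION}-%{RELEASE} %{ARCH}\n"


def find_duplicates(rpm_lines):
    """Return {(name, arch): [versions]} for packages installed more than once."""
    versions = defaultdict(set)
    for line in rpm_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # e.g. "package lvm2 is not installed"
        name, version, arch = parts
        versions[(name, arch)].add(version)
    return {key: sorted(v) for key, v in versions.items() if len(v) > 1}


def query_installed(packages=("device-mapper", "lvm2")):
    """Ask rpm which versions of the given packages are installed."""
    result = subprocess.run(
        ["rpm", "-q", "--qf", QUERY_FORMAT, *packages],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for (name, arch), found in find_duplicates(query_installed()).items():
        print(f"{name}.{arch} has multiple versions installed: {', '.join(found)}")
```

Grouping by architecture as well as name is a deliberate choice: on x86_64 an i386 and an x86_64 copy of the same version can legitimately coexist, so only differing versions within one architecture are reported.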

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 7 Dave Jones 2006-11-24 22:51:22 UTC

This bug has been mass-closed along with all other bugs that
have been in NEEDINFO state for several months.

Due to the large volume of inactive bugs in bugzilla, this
is the only method we have of cleaning out stale bug reports
where the reporter has disappeared.

If you can reproduce this bug after installing all the
current updates, please reopen this bug.

If you are not the reporter, you can add a comment requesting
it be reopened, and someone will get to it asap.

Thank you.
