171957 – rmmod aic79xx causes kernel panic

Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Version:	4
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Brian Brock

Reported:	2005-10-28 09:08 UTC by Didier
Modified:	2007-11-30 22:11 UTC
Last Closed:	2005-11-10 18:56:09 UTC
Type:	---

Attachments
kernel trace from kernel-2.6.14-1.1632_FC5kdump (6.11 KB, text/plain) 2005-10-28 18:07 UTC, Didier
screen dump from 2.6.13-1.1532_FC4 kernel oops (3.52 MB, image/jpeg) 2005-11-02 08:54 UTC, Didier

Description Didier 2005-10-28 09:08:34 UTC

Description of problem:
With a SCSI device attached, rmmod'ing the aic79xx module causes a kernel panic.


Version-Release number of selected component (if applicable):
kernel-2.6.13-1.1532_FC4


How reproducible:
Always.


Steps to Reproduce:
1. Attach SCSI device
2. # modprobe aic79xx
3. # rmmod aic79xx
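
The steps above can be sketched as a small script. This is only a sketch: the run and reproduce helper names and the DRY_RUN variable are illustrative and not part of the original report. DRY_RUN defaults to 1, so the commands are only printed; on an affected kernel the real sequence is expected to panic the machine, so it should only be run (as root) on a test system.

```shell
#!/bin/sh
# Reproduction sketch for the modprobe/rmmod steps above.
# MODULE, DRY_RUN, run and reproduce are illustrative names.
MODULE=aic79xx
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = really execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Load the driver, then unload it again (steps 2 and 3).
reproduce() {
    run modprobe "$MODULE"
    run rmmod "$MODULE"
}

reproduce
```

Setting DRY_RUN=0 executes the two commands for real.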


Actual results:
/var/log/messages shows "kernel: Synchronizing SCSI cache for disk sda:",
and the kernel oopses approx. 5-7 seconds later.
The oops is displayed on-screen, but not written to /var/log/messages (located
on the IDE boot disk).
[note : after the weekend, I'll take a digital photo of the screen and post it
as an attachment]

Expected results:
The kernel should unload the module without oopsing.


Additional info:

* As long as no SCSI device is attached when the aic79xx module is loaded,
unloading works OK.

* Reason for rmmod'ing : a JetStor RAID array is attached to the SCSI card. When
the array is expanded, an rmmod is needed to let the kernel recognize the
enlarged disk geometry (the alternative is a reboot, which is not always an
option in a production environment).

* SCSI card is an Adaptec 29320LP :
# cat /proc/scsi/aic79xx/0
Adaptec AIC79xx driver version: 1.3.11
Adaptec 29320LP Ultra320 SCSI adapter
aic7901A: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
Allocated SCBs: 4, SG List Length: 128

* Tested with :
- both Adaptec 29320LP (BIOS 4.10.0S) and 29320ALP (BIOS 4.30.1)
- in different PCI slots
- with and without kernel boot parameter "acpi=nopci"

* Not yet tested on RHEL AS4 (hesitant to lock up my production servers).
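
The first observation above (unloading works when no SCSI device is attached) suggests checking for attached devices before attempting an rmmod on an affected kernel. A minimal sketch, assuming the usual /proc/scsi/scsi listing with one "Host:" line per attached device; the function names are hypothetical, and the count covers every SCSI host in the system, not only the aic79xx adapter:

```shell
#!/bin/sh
# Count attached SCSI devices by counting "Host:" lines in the listing.
# The path is a parameter so the function can be tried on a saved copy.
count_scsi_devices() {
    f=${1:-/proc/scsi/scsi}
    [ -r "$f" ] || return 2          # unreadable listing: report an error
    awk '/^Host:/ { n++ } END { print n + 0 }' "$f"
}

# Hypothetical guard: succeed only when the listing is readable and empty.
safe_to_unload() {
    n=$(count_scsi_devices "$1") || return 1
    [ "$n" -eq 0 ]
}
```

It could be used as "safe_to_unload && rmmod aic79xx"; if the listing cannot be read, the guard errs on the side of not unloading.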

Comment 1 Didier 2005-10-28 15:38:42 UTC

According to http://www.ussg.iu.edu/hypermail/linux/kernel/0409.3/1247.html,
aic79xx 1.3.11 dates from 2003 and is quite stale.
The most recent version (http://people.freebsd.org/~gibbs/linux/SRC/ , kindly
pointed out by Justin Gibbs) apparently has its share of problems too.

Possible fix in http://lkml.org/lkml/2005/10/3/106 (James Bottomley) for 2.6.14-rc4.

I'm fully prepared to test this with a Fedora Rawhide kernel, but as this setup
(Adaptec 29320ALP + SCSI RAID) will be moved to a production environment, is
there any chance the fix will be propagated to RHEL4 at (relatively) short
notice ?

- Is it useful to follow up on this bug ?
- Should I file an RFE on RHEL, referring to this bug ?

Comment 2 Didier 2005-10-28 18:04:58 UTC

Additional remarks :
- "rmmod aic79xx" is confirmed fixed in rawhide 2.6.14-1.1632_FC5 ;
- a subsequent "modprobe aic79xx" oopses the kernel.

Comment 3 Didier 2005-10-28 18:07:52 UTC

Created attachment 120513
kernel trace from kernel-2.6.14-1.1632_FC5kdump

Oct 28 18:21:40 : "# rmmod aic79xx"
Oct 28 18:22:00 : "# modprobe aic79xx"

Comment 4 Didier 2005-11-02 08:54:37 UTC

Created attachment 120633
screen dump from 2.6.13-1.1532_FC4 kernel oops

As indicated in the original bug report, I'm providing a screen dump (taken in
runlevel 1) of the kernel panic when rmmod'ing aic79xx with an attached SCSI
device.
Apparently, the system hangs after the rmmod, and oopses when manually
(SysRq-S) syncing.

Comment 5 Didier 2005-11-10 17:01:48 UTC

Seems to be fixed in 2.6.14-1.1636_FC4.

Comment 6 Dave Jones 2005-11-10 20:30:44 UTC

2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.
