Bug 171957

Summary: rmmod aic79xx causes kernel panic
Product: [Fedora] Fedora
Component: kernel
Version: 4
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Didier <d.bz-redhat>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Brian Brock <bbrock>
CC: wtogami
Doc Type: Bug Fix
Last Closed: 2005-11-10 18:56:09 UTC
Attachments:
- kernel trace from kernel-2.6.14-1.1632_FC5kdump (flags: none)
- screen dump from 2.6.13-1.1532_FC4 kernel oops (flags: none)

Description Didier 2005-10-28 09:08:34 UTC
Description of problem:
With a SCSI device attached, rmmod'ing the aic79xx module causes a kernel panic.


Version-Release number of selected component (if applicable):
kernel-2.6.13-1.1532_FC4


How reproducible:
Always.


Steps to Reproduce:
1. Attach SCSI device
2. # modprobe aic79xx
3. # rmmod aic79xx
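
The steps above can be sketched as a small shell script. This is only an illustration and not part of the original report: the names MODULE, DRY_RUN, run and reproduce are made up here. By default the script only prints the commands, since actually running them on an affected kernel is expected to panic the machine.

```shell
#!/bin/sh
# Illustrative sketch of the reproduction steps (hypothetical helper names).
# Assumes a SCSI device is already attached to the adapter (step 1).
MODULE="${MODULE:-aic79xx}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them (as root)

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reproduce() {
    run modprobe "$MODULE"   # step 2: load the driver
    run rmmod "$MODULE"      # step 3: unload it; affected kernels oops a few seconds later
}

reproduce
```

Run as-is, it prints the two commands. Setting DRY_RUN=0 executes them and should only be done on a test machine, preferably with a serial or netconsole attached, because the oops is not written to /var/log/messages.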


Actual results:
/var/log/messages shows "kernel: Synchronizing SCSI cache for disk sda:"
and the kernel oopses approx. 5-7 seconds later.
The oops is displayed on screen, but is not written to /var/log/messages (which
is located on the IDE boot disk).
[note : after the weekend, I'll take a digital photo of the screen, and post it
as an attachment]

Expected results:
Kernel should unload module.


Additional info:

* As long as no SCSI device is attached when the aic79xx module is loaded,
unloading works OK.

* Reason for rmmod'ing : a JetStor RAID array is attached to the SCSI card. When the
array is expanded, an rmmod is needed for the kernel to recognize the enlarged
disk geometry (the alternative is a reboot, which is not always an option in a
production environment).

* SCSI card is an Adaptec 29320LP :
# cat /proc/scsi/aic79xx/0
Adaptec AIC79xx driver version: 1.3.11
Adaptec 29320LP Ultra320 SCSI adapter
aic7901A: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
Allocated SCBs: 4, SG List Length: 128

* Tested with :
- both Adaptec 29320LP (BIOS 4.10.0S) and 29320ALP (BIOS 4.30.1)
- in different PCI slots
- with and without kernel boot parameter "acpi=nopci"

* Not yet tested on RHEL AS4 (hesitant to lock up my production servers).

Comment 1 Didier 2005-10-28 15:38:42 UTC
According to http://www.ussg.iu.edu/hypermail/linux/kernel/0409.3/1247.html,
aic79xx 1.3.11 dates from 2003 and is quite stale.
The most recent version (http://people.freebsd.org/~gibbs/linux/SRC/ , which
Justin Gibbs kindly pointed me to) apparently has its share of problems too.

Possible fix in http://lkml.org/lkml/2005/10/3/106 (James Bottomley) for 2.6.14-rc4.

I'm fully prepared to test this with a Fedora Rawhide kernel, but as this setup
(Adaptec 29320ALP + SCSI RAID) will be moved to a production environment, is
there any chance the fix will be propagated to RHEL4 on (relatively) short notice ?

- Is it useful to follow up on this bug ?
- Should I file an RFE on RHEL, referring to this bug ?



Comment 2 Didier 2005-10-28 18:04:58 UTC
Additional remarks :
- "rmmod aic79xx" confirmed fixed in rawhide 2.6.14-1.1632_FC5 ;
- subsequent "modprobe aic79xx" oopses kernel.

Comment 3 Didier 2005-10-28 18:07:52 UTC
Created attachment 120513 [details]
kernel trace from kernel-2.6.14-1.1632_FC5kdump

Oct 28 18:21:40 : "# rmmod aic79xx"
Oct 28 18:22:00 : "# modprobe aic79xx"

Comment 4 Didier 2005-11-02 08:54:37 UTC
Created attachment 120633 [details]
screen dump from 2.6.13-1.1532_FC4 kernel oops

As indicated in the original bug report, I'm providing a screen dump (taken in
runlevel 1) of the kernel panic when rmmod'ing aic79xx with an attached SCSI device.
Apparently, the system hangs after the rmmod, and oopses on a manual sync
(SysRq-S).

Comment 5 Didier 2005-11-10 17:01:48 UTC
Seems to be fixed in 2.6.14-1.1636_FC4.



Comment 6 Dave Jones 2005-11-10 20:30:44 UTC
2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.