Bug 106791 - aic_7xxx DRIVER_LOCK blocks interrupts
Summary: aic_7xxx DRIVER_LOCK blocks interrupts
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-10-10 17:08 UTC by Kevin Krafthefer
Modified: 2007-11-30 22:06 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-10-17 15:43:27 UTC
Target Upstream Version:
Embargoed:



Description Kevin Krafthefer 2003-10-10 17:08:33 UTC
Description of problem:
In aic7xxx_done_cmds_complete(), DRIVER_LOCK blocks interrupts
while the while loop runs. The AS2.1 driver does not take that
lock there. RHEL3's driver does, and seems to not have this problem.

Version-Release number of selected component (if applicable):
e.27

How reproducible:
fleetingly

Steps to Reproduce:
1. Run e.27 with aic_7xxx
2. Wait
3.
    
Actual results:


Expected results:


Additional info:

Comment 1 Kevin Krafthefer 2003-10-10 17:10:46 UTC
See issue tracker 28301
It looks like the real kernel trace would be something like this:

EIP is at aic7xxx_handle_scsiint [aic7xxx] 0x258
eax: 0000000d ebx: f7745084 ecx: f8848000 edx: 00000000
esi: f8848000 edi: 00000000 ebp: 00000000 esp: c22f9bc8
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c22f9000)
Stack: 00000013 f8a89000 f7370b20 00000206 d83d7178 c0ab6f00 00000020 d83d7040
c0201c37 d83d7040 c2653360 000004ec 00000001 00000056 ea57dd6c 00000001
00000286 cdeae5a0 c011970d ea57c000 00000282 d83d7040 00000001 03000014
aic7xxx_isr
do_aic7xxx_isr
handle_IRQ_event
do_IRQ
call_do_IRQ
aic7xxx_done_cmds_complete  (takes SCSI interrupt)
aic7xxx_handle_scsiint
aic7xxx_isr
aic7xxx_abort
scsi_abort
scsi_old_times_out
__run_timers
run_local_timers
smp_apic_timer_interrupt [kernel] 0xb8
do_IRQ [kernel] 0xe3
default_idle [kernel] 0x0

The CPU was idle, took a timer interrupt, and ran some timer
functions that were due, including the scsi_old_times_out()
function.  It determined that a SCSI command had timed out,
so it issued an abort.  This led to aic7xxx_done_cmds_complete(),
which took a SCSI interrupt while it was executing.  In handling
the interrupt, aic7xxx_isr() called aic7xxx_handle_scsiint(),
which panicked on a NULL reference at 0x258 bytes into the
function.  So my guess is that it was touching something
that aic7xxx_done_cmds_complete() was fiddling with when it
took the SCSI interrupt.  What I find interesting are the
changes made to that function between AS2.1 and RHEL3.

Here's the AS2.1 version:

static void
aic7xxx_done_cmds_complete(struct aic7xxx_host *p)
{
 Scsi_Cmnd *cmd;

 while (p->completeq.head != NULL)
 {
   cmd = p->completeq.head;
   p->completeq.head = (Scsi_Cmnd *)cmd->host_scribble;
   cmd->host_scribble = NULL;
   cmd->scsi_done(cmd);
 }
}

Here's the RHEL3 version:

static void
aic7xxx_done_cmds_complete(struct aic7xxx_host *p)
{
 Scsi_Cmnd *cmd;
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,1,95)
 unsigned int cpu_flags = 0;
#endif

 DRIVER_LOCK
 while (p->completeq.head != NULL)
 {
   cmd = p->completeq.head;
   p->completeq.head = (Scsi_Cmnd *)cmd->host_scribble;
   cmd->host_scribble = NULL;
   cmd->scsi_done(cmd);
 }
 DRIVER_UNLOCK
}

In the RHEL3 version, DRIVER_LOCK blocks interrupts for the
duration of the while loop.  The AS2.1 version takes no lock
here, so the loop would appear vulnerable to the SCSI interrupt
unless interrupts were already blocked by the caller.
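
To make the suspected window concrete, here is a minimal user-space
sketch, not driver code.  It assumes the interrupt handler pushes a
newly completed command onto the head of the queue; that is an
assumption about the driver, not something taken from the trace.  The
names struct cmd, struct host, sim_interrupt, drain and accounted_for
are all hypothetical, and the irqs_blocked flag only stands in for
whatever DRIVER_LOCK really expands to.  The sketch shows one way an
unprotected interleaving could corrupt the list; it does not claim to
reproduce the NULL reference in aic7xxx_handle_scsiint().

```c
#include <stddef.h>

/* Hypothetical stand-in for Scsi_Cmnd; next plays the role of host_scribble. */
struct cmd {
    int id;
    struct cmd *next;
};

/* Hypothetical stand-in for struct aic7xxx_host. */
struct host {
    struct cmd *head;     /* plays the role of completeq.head */
    int irqs_blocked;     /* nonzero while the simulated DRIVER_LOCK is held */
    struct cmd *pending;  /* command the simulated interrupt wants to queue */
    int completed;        /* commands handed to the simulated scsi_done */
};

/* Simulated ISR: push the pending command at the head (ASSUMED behavior),
 * unless interrupts are currently blocked. */
static void sim_interrupt(struct host *h)
{
    if (h->irqs_blocked || h->pending == NULL)
        return;
    h->pending->next = h->head;
    h->head = h->pending;
    h->pending = NULL;
}

/* Drain loop shaped like aic7xxx_done_cmds_complete().  The simulated
 * interrupt is offered between reading head and updating it. */
static void drain(struct host *h, int use_lock)
{
    struct cmd *c;

    if (use_lock)
        h->irqs_blocked = 1;      /* stands in for DRIVER_LOCK */
    while (h->head != NULL) {
        c = h->head;
        sim_interrupt(h);         /* window where a real IRQ could land */
        h->head = c->next;
        c->next = NULL;
        h->completed++;           /* stands in for cmd->scsi_done(cmd) */
    }
    if (use_lock) {
        h->irqs_blocked = 0;      /* stands in for DRIVER_UNLOCK */
        sim_interrupt(h);         /* deferred interrupt is delivered now */
    }
}

/* Returns how many of the two commands are still accounted for
 * (already completed, or still sitting on the queue) after one drain. */
int accounted_for(int use_lock)
{
    struct cmd a = { 1, NULL }, b = { 2, NULL };
    struct host h = { .head = &a, .irqs_blocked = 0,
                      .pending = &b, .completed = 0 };
    struct cmd *c;
    int queued = 0;

    drain(&h, use_lock);
    for (c = h.head; c != NULL; c = c->next)
        queued++;
    return h.completed + queued;
}
```

Under these assumptions, accounted_for(0) returns 1: the command
queued by the interrupt is dropped when head is overwritten with the
stale next pointer.  accounted_for(1) returns 2, because the
interrupt is held off until the loop has finished.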

