Bug 184233 - aacraid driver io_callback races scsi_times_out
Summary: aacraid driver io_callback races scsi_times_out
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Chip Coldwell
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2006-03-07 16:08 UTC by Chip Coldwell
Modified: 2007-11-30 22:07 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-04-18 19:09:35 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
complete kernel log (59.63 KB, text/plain)
2006-03-07 16:08 UTC, Chip Coldwell
no flags Details

Description Chip Coldwell 2006-03-07 16:08:33 UTC
What's happening is a race between I/O completions (io_callback) and
the timeout timer function (scsi_times_out).  We had a crash with all
SCSI logging enabled that produced a vmcore, so I was able to identify
all of the addresses.  The unadulterated log is attached, but here it
is with my annotations:

- scsi_request_fn (indirect via request_queue_t->request_fn)
  - scsi_allocate_device 
    "Activating command for device 0 (2)"
    numbers are target and host_active.  Function increments host_active
  - sd_init_command (indirect via scsi_template->init_command)
    "Doing sd request, dev = 0x801, block = 9112"
    "sda : real dev = /dev/0, block = 9175"
    "sda : writing 16/16 512 byte blocks."
  - scsi_dispatch_command
    - scsi_add_timer
      "Adding timer for command c98a1000 at 6000 (f8812da0)"
      Install a 60-second timeout with the timer function scsi_times_out.
    "scsi_dispatch_cmnd (host = 1, channel = 0, target = 0, command = "
    "c98a105c, buffer = c0542000, bufflen = 8192, done = f882bd90)"
    done = f882bd90 is rw_intr (drivers/scsi/sd.c: 648)
    "queuecommand : routine at f8860090"
    f8860090 is aac_queuecommand; 
    "leaving scsi_dispatch_cmnd()"

- scsi_times_out (timer function)
    "Command timed out active=2 busy=2 failed=1"
    Wakes up the error handler thread.
    Sets scsi_cmnd->state = SCSI_STATE_TIMEOUT
    Sets scsi_cmnd->owner = SCSI_OWNER_ERROR_HANDLER

- io_callback
  - aac_io_done
    - scsi_add_timer
      "Adding timer for command c98a1000 at 6000 (f8861280)"
      Replaces scsi_times_out timer with aacraid_timeout_dummy_fn in an
      attempt to prevent scsi_times_out from running if the command
      completes.
    - scsi_done
      - scsi_delete_timer
        "Clearing timer for command c98a1000 1"
	Returns 1 (the second number) even though command timed out
	because of dummy_fn installed by aac_io_done.  This thread
	finishes by raising the SCSI_SOFTIRQ.

      N.B. we do not get the message "Ignoring completion of %p due to
      timeout status" that would be expected if the timeout function
      had run by now.  In other words, the scsi_cmnd->state has not
      yet been changed to SCSI_STATE_UNUSED; we're racing with the
      timeout function.  We might have detected the timeout expiring
      if it hadn't been replaced by aacraid_timeout_dummy_fn.

- scsi_softirq_handler
  - scsi_decide_disposition
  "Command finished 2 1 0x0"
  scsi_decide_disposition returned SUCCESS.  The numbers are
  host_busy, host_failed and scsi_cmnd->result.
  - scsi_finish_command
  "Notifying upper driver of completion for device 0 0"
  The numbers are device->id and scsi_cmnd->result
  - sd.c:rw_intr (indirect via scsi_cmnd->done)
  "sda : rw_intr(1, 0 [0 0])"
  The numbers are host number, result, sense buffer error code and sense key.
    - scsi_io_completion
    "16 sectors total, 16 sectors done."
    "use_sg is 2"
    Used a two element (page) scatterlist (8192 bytes on two pages).
      - __scsi_end_request
        - __scsi_release_command
        "Deactivating command for device 0 (active=1, failed=1)"
	sets scsi_cmnd->state = SCSI_STATE_UNUSED
	sets scsi_cmnd->owner = SCSI_OWNER_NOBODY
	"Error handler thread 1 0"
	numbers are host->in_recovery and host->eh_active
  "Waking error handler thread (-1)"
  Now we're going to wake up the error handler thread to work on the
  command we just released.

The rest of the log file is what you would expect to happen if the
error handler runs with host_failed > 0 but zero failed commands in
the host queue.

Error handler waking up
scsi_unjam_host: Checking to see if we need to request sense
Command to ID 0 timedout
Total of 0+1 commands on 1 devices require eh work
scsi_unjam_host: Checking to see if we want to try abort
scsi_unjam_host: Checking to see if we want to try BDR
scsi_unjam_host: Try hard bus reset
Sleeping for timer tics 500
queuecommand : routine at f8860090
leaving scsi_dispatch_cmnd()
Adding timer for command c98a1c00 at 6000 (f8861280)
Clearing timer for command c98a1c00 1
Command finished 1 1 0x0
Notifying upper driver of completion for device 0 0
sda : rw_intr(1, 0 [0 0])
8 sectors total, 8 sectors done.
use_sg is 1
 Deactivating command for device 0 (active=0, failed=1)
Error handler thread 1 1
scsi_unjam_host: Take device offline
Kernel panic: scsi_unjam_host: Miscount of number of failed commands.

Comment 1 Chip Coldwell 2006-03-07 16:08:34 UTC
Created attachment 125761 [details]
complete kernel log

Comment 2 Chip Coldwell 2006-03-07 16:09:56 UTC
The vmcore is here:

ftp://enterprise.redhat.com/incoming/bz171473-adaptec-vmcore.tar.gz

Comment 3 Chip Coldwell 2006-03-07 16:37:31 UTC
From mark_salyzyn Tue Mar  7 11:33:40 2006
Date: Tue, 7 Mar 2006 11:33:26 -0500
From: "Salyzyn, Mark" <mark_salyzyn>
To: Chip Coldwell <coldwell>
Cc: Tom Coughlan <coughlan>
Subject: RE: race in AAC Linux driver

The driver we supplied for RHEL3 U7 was 1.1.5-2412. U6 used 1.1.5-2361
which is a branch of 1.1.5-2340 with one fix for aac_info. The CHANGELOG
was not updated in your sources when we submitted the driver source
update so it can not be used as a guide, look in aacraid.h for
AAC_DRIVER_BUILD for the build portion of the version number. I checked
the sources in 2.4.21-37.EL and it is in fact using the older driver, I
believe we solved such an issue in 1.1.5-2343 by changing all the code
that manipulated the timers.

Tom, I would have expected the driver we worked with you to put into U7
should resolve this problem.

Chip, it is your call if you want an investigation into an incremental
fix for the aging driver in 2.4.21-37.EL, or if you jump to the one to
be released in a weeks time with RHEL3 U7.

Sincerely -- Mark Salyzyn

> -----Original Message-----
> From: Chip Coldwell [coldwell] 
> Sent: Tuesday, March 07, 2006 11:07 AM
> To: Salyzyn, Mark
> Subject: race in AAC Linux driver
> 
> 
> Hi,
> 
> I got your contact info from Tom Coughlan.  I'm working on a bug in
> the aacraid driver that we ship with Red Hat Enterprise Linux version
> 3 (kernel 2.4.21-37.ELsmp; driver version 1.1.5-2340 judging from the
> contents of the CHANGELOG).  What's happening is a race between I/O
> completions (io_callback) and the timeout timer function
> (scsi_times_out).  I think the problem boils down to one of needing to
> use del_timer_sync when changing the SCSI timeout, but I welcome your
> opinion.
> 
> We had a crash with all
> SCSI logging enabled that produced a vmcore, so I was able to identify
> all of the addresses.  The unadulterated log is attached, but here it
> is with my annotations:
> 
> - scsi_request_fn (indirect via request_queue_t->request_fn)
>    - scsi_allocate_device
>      "Activating command for device 0 (2)"
>      numbers are target and host_active.  Function increments 
> host_active
>    - sd_init_command (indirect via scsi_template->init_command)
>      "Doing sd request, dev = 0x801, block = 9112"
>      "sda : real dev = /dev/0, block = 9175"
>      "sda : writing 16/16 512 byte blocks."
>    - scsi_dispatch_command
>      - scsi_add_timer
>        "Adding timer for command c98a1000 at 6000 (f8812da0)"
>        Install a 60-second timeout with the timer function 
> scsi_times_out.
>      "scsi_dispatch_cmnd (host = 1, channel = 0, target = 0, 
> command = "
>      "c98a105c, buffer = c0542000, bufflen = 8192, done = f882bd90)"
>      done = f882bd90 is rw_intr (drivers/scsi/sd.c: 648)
>      "queuecommand : routine at f8860090"
>      f8860090 is aac_queuecommand;
>      "leaving scsi_dispatch_cmnd()"
> 
> - scsi_times_out (timer function)
>      "Command timed out active=2 busy=2 failed=1"
>      Wakes up the error handler thread.
>      Sets scsi_cmnd->state = SCSI_STATE_TIMEOUT
>      Sets scsi_cmnd->owner = SCSI_OWNER_ERROR_HANDLER
> 
> - io_callback
>    - aac_io_done
>      - scsi_add_timer
>        "Adding timer for command c98a1000 at 6000 (f8861280)"
>        Replaces scsi_times_out timer with 
> aacraid_timeout_dummy_fn in an
>        attempt to prevent scsi_times_out from running if the command
>        completes.
>      - scsi_done
>        - scsi_delete_timer
>          "Clearing timer for command c98a1000 1"
>  	Returns 1 (the second number) even though command timed out
>  	because of dummy_fn installed by aac_io_done.  This thread
>  	finishes by raising the SCSI_SOFTIRQ.
> 
>        N.B. we do not get the message "Ignoring completion of 
> %p due to
>        timeout status" that would be expected if the timeout function
>        had run by now.  In other words, the scsi_cmnd->state has not
>        yet been changed to SCSI_STATE_UNUSED; we're racing with the
>        timeout function.  We might have detected the timeout expiring
>        if it hadn't been replaced by aacraid_timeout_dummy_fn.
> 
> - scsi_softirq_handler
>    - scsi_decide_disposition
>    "Command finished 2 1 0x0"
>    scsi_decide_disposition returned SUCCESS.  The numbers are
>    host_busy, host_failed and scsi_cmnd->result.
>    - scsi_finish_command
>    "Notifying upper driver of completion for device 0 0"
>    The numbers are device->id and scsi_cmnd->result
>    - sd.c:rw_intr (indirect via scsi_cmnd->done)
>    "sda : rw_intr(1, 0 [0 0])"
>    The numbers are host number, result, sense buffer error 
> code and sense key.
>      - scsi_io_completion
>      "16 sectors total, 16 sectors done."
>      "use_sg is 2"
>      Used a two element (page) scatterlist (8192 bytes on two pages).
>        - __scsi_end_request
>          - __scsi_release_command
>          "Deactivating command for device 0 (active=1, failed=1)"
>  	sets scsi_cmnd->state = SCSI_STATE_UNUSED
>  	sets scsi_cmnd->owner = SCSI_OWNER_NOBODY
>  	"Error handler thread 1 0"
>  	numbers are host->in_recovery and host->eh_active
>    "Waking error handler thread (-1)"
>    Now we're going to wake up the error handler thread to work on the
>    command we just released.
> 
> The rest of the log file is what you would expect to happen if the
> error handler runs with host_failed > 0 but zero failed commands in
> the host queue.
> 
> Error handler waking up
> scsi_unjam_host: Checking to see if we need to request sense
> Command to ID 0 timedout
> Total of 0+1 commands on 1 devices require eh work
> scsi_unjam_host: Checking to see if we want to try abort
> scsi_unjam_host: Checking to see if we want to try BDR
> scsi_unjam_host: Try hard bus reset
> Sleeping for timer tics 500
> queuecommand : routine at f8860090
> leaving scsi_dispatch_cmnd()
> Adding timer for command c98a1c00 at 6000 (f8861280)
> Clearing timer for command c98a1c00 1
> Command finished 1 1 0x0
> Notifying upper driver of completion for device 0 0
> sda : rw_intr(1, 0 [0 0])
> 8 sectors total, 8 sectors done.
> use_sg is 1
>   Deactivating command for device 0 (active=0, failed=1)
> Error handler thread 1 1
> scsi_unjam_host: Take device offline
> Kernel panic: scsi_unjam_host: Miscount of number of failed commands.
> 
> Chip
> 
> -- 
> Charles M. "Chip" Coldwell
> Senior Software Engineer
> Red Hat, Inc
> 


Comment 6 Chip Coldwell 2006-04-18 19:09:35 UTC
(In reply to comment #5)
> This bug is basically a red herring.  If you are logging every SCSI transaction
> to a file stored on a SCSI disk, you are going to get in trouble somehow.  This
> is how it happened for this particular driver.
> 
> Recommending that we close this as NOTABUG.

Closing as NOTABUG.

Chip



Note You need to log in before you can comment on or make changes to this bug.