Bug 105912

Summary: QLA12160 instability (Kernel may crash)
Product: Red Hat Enterprise Linux 3 Reporter: Jun'ichi NOMURA <junichi.nomura>
Component: kernel Assignee: Tom Coughlan <coughlan>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-03-30 18:57:20 UTC Type: ---
Attachments:
Description Flags
patch to limit max queue depth none

Description Jun'ichi NOMURA 2003-09-29 13:54:29 UTC
Description of problem:

When large sequential I/O is issued to multiple disks through a single HBA,
many SCSI time-out messages appear for several hard disks
and the kernel eventually panics.

The cause is that the low-level driver accepts more requests than it can handle.
The solution is to set the driver parameters to proper values.
A patch will be attached.

Moreover, the QLA12160 driver version is 3.00 Beta, which is very old
and not stable. Even AS 2.1 has the 3.23.19 Beta driver.

Version-Release number of selected component (if applicable):
2.4.21-3.EL

How reproducible:
Every time, after a typical I/O load.

Steps to Reproduce:
Perform the following sequence.

1. Connect 4 hard disks to 1 HBA
   (They are referred to as /dev/sda, /dev/sdb, /dev/sdc, /dev/sdd in the
    following steps.)

2. Create partitions larger than 5GB on each disk
      # parted /dev/sda mkpartfs primary ext2 0.000 5000.000
      # parted /dev/sdb mkpartfs primary ext2 0.000 5000.000
      # parted /dev/sdc mkpartfs primary ext2 0.000 5000.000
      # parted /dev/sdd mkpartfs primary ext2 0.000 5000.000

3. Mount the partitions
      # mkdir /mnt/0;  mkdir /mnt/1;  mkdir /mnt/2;  mkdir /mnt/3
      # mount /dev/sda1 /mnt/0
      # mount /dev/sdb1 /mnt/1
      # mount /dev/sdc1 /mnt/2
      # mount /dev/sdd1 /mnt/3

4. Execute the shell script below
      #!/bin/sh
      i=0
      while true; do
        i=`expr $i + 1`
        echo -n "Loop = $i, "
        date
        dd if=/dev/zero of=/mnt/0/zero0 bs=16M count=120 &
        dd if=/dev/zero of=/mnt/1/zero1 bs=16M count=120 &
        dd if=/dev/zero of=/mnt/2/zero2 bs=16M count=120 &
        dd if=/dev/zero of=/mnt/3/zero3 bs=16M count=120
        sync
        sleep 30
      done
    
Actual results:
The kernel panics or SCSI time-outs occur about an hour after starting step 4.

Expected results:
Panic and/or timeout should not occur.

Additional info:
The mid-level driver (e.g. scsi.c) puts requests into the per-LUN queues of
the low-level driver and starts a timer for request time-out handling.
The maximum number of queueable requests is determined by the 'can_queue'
member of Scsi_Host_Template. qla1280.c sets can_queue to 0xfffff.

The low-level driver (e.g. qla1280.c) issues requests using the HBA's ring buffer.
The size of the ring buffer is hardcoded as 256 (=REQUEST_ENTRY_CNT).

Thus, the mid level can queue at most 0xfffff * (number of LUNs) requests
simultaneously. On the other hand, the low level can handle only 256 at the
same time. As a result, many requests remain in the low level waiting
for room in the ring buffer. ....(1)
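
To put numbers on problem (1), here is a rough Python calculation using the
figures quoted above. The LUN count of 4 is an assumption taken from the
reproduction steps, and the function names are made up for illustration.

```python
# Figures quoted in this report.
CAN_QUEUE = 0xFFFFF        # can_queue value set by qla1280.c
REQUEST_ENTRY_CNT = 256    # hardcoded ring buffer size

# Assumption: four LUNs, as in the reproduction steps.
NUM_LUNS = 4


def max_accepted(num_luns, can_queue=CAN_QUEUE):
    """Upper bound on requests the mid level may queue, per the report."""
    return can_queue * num_luns


def oversubscription(num_luns, ring_size=REQUEST_ENTRY_CNT):
    """How many times the accepted load can exceed the ring capacity."""
    return max_accepted(num_luns) / ring_size


if __name__ == "__main__":
    print("accepted upper bound:", max_accepted(NUM_LUNS))
    print("ring capacity:       ", REQUEST_ENTRY_CNT)
    print("ratio:               ", oversubscription(NUM_LUNS))
```

With four LUNs the bound is roughly four million queued requests against
256 ring entries, so nearly all accepted requests sit waiting while their
time-out timers are already running.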

The maximum number of requests that can be issued per LUN is determined by
the hiwat member of bus_param_t.
The hiwat value is set to max_queue_depth (an NVRAM parameter) minus 1.
max_queue_depth seems to be set to 256 and cannot be changed
from the QLA12160 BIOS utility.

The low-level driver moves requests from the queue to the ring buffer until
one of the following conditions is fulfilled:
  - the queue becomes empty
  - the number of outstanding requests for the LUN reaches the hiwat value
  - the ring buffer becomes full

When the ring buffer becomes full, the remaining requests are put back
on the queue. The low-level driver retries putting them into the ring buffer
when outstanding I/O completes.
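
The issue loop described above can be sketched in Python as follows. This is
a simplified model, not the qla1280.c code: the function name, the use of a
deque for the ring, and the order in which the three conditions are checked
are all assumptions made for illustration.

```python
from collections import deque


def dispatch(queue, ring, lun, outstanding, hiwat, ring_size):
    """Move requests for one LUN from its queue into the ring.

    Returns the reason the loop stopped (hypothetical labels).
    """
    while True:
        if not queue:
            return "queue empty"
        if outstanding[lun] >= hiwat:
            return "hiwat reached"
        if len(ring) >= ring_size:
            # In this model the remaining requests simply stay queued
            # and are retried after an I/O completion.
            return "ring full"
        ring.append((lun, queue.popleft()))
        outstanding[lun] += 1


if __name__ == "__main__":
    ring = deque()
    outstanding = {0: 0, 1: 0}
    lun0_queue = deque(range(300))
    lun1_queue = deque(range(300))
    # hiwat = 256 - 1, ring size = 256, as quoted in the report.
    print(dispatch(lun0_queue, ring, 0, outstanding, 255, 256))
    print(dispatch(lun1_queue, ring, 1, outstanding, 255, 256))
    print(outstanding)
```

With these values the first LUN stops only at its hiwat of 255, leaving a
single ring entry for the second LUN, which then stops because the ring is
full.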

Because the size of the ring buffer and the hiwat value are very close,
and the low-level driver issues as many requests as possible for a single
LUN, the ring buffer can be occupied almost entirely by the requests for
that LUN. Once this occupation occurs, the same LUN again gets the next
chance to issue requests after I/O completion.
Thus, the unfairness in issuing requests continues. ...(2)
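
A toy Python simulation of problem (2) is given here as a sketch only. It
assumes that the per-LUN queues never run empty (as under the dd load), that
the first pass visits LUNs in order, that the oldest request in the ring
completes first, and that the completing LUN is the one that gets to refill.
None of these details are taken from the driver source.

```python
from collections import deque

RING_SIZE = 256  # REQUEST_ENTRY_CNT in the report


def simulate(hiwat, num_luns=4, completions=2000):
    """Return how many requests each LUN managed to issue to the ring."""
    outstanding = [0] * num_luns   # requests currently in the ring, per LUN
    issued = [0] * num_luns        # total requests issued, per LUN
    ring = deque()                 # LUN ids, oldest request first

    def fill(lun):
        # Issue for one LUN until hiwat is reached or the ring is full.
        while outstanding[lun] < hiwat and len(ring) < RING_SIZE:
            ring.append(lun)
            outstanding[lun] += 1
            issued[lun] += 1

    for lun in range(num_luns):    # assumption: initial pass in LUN order
        fill(lun)
    for _ in range(completions):
        done = ring.popleft()      # assumption: oldest request completes
        outstanding[done] -= 1
        fill(done)                 # the completing LUN refills first
    return issued


if __name__ == "__main__":
    print("hiwat=255:", simulate(hiwat=255))
    print("hiwat=15: ", simulate(hiwat=15))
```

Under these assumptions, with hiwat at 255 the third and fourth LUNs never
issue a single request, while a hiwat of 15 lets every LUN make progress.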

Problems (1) and (2) together lead to a situation in which many requests for
some LUNs time out without ever being issued to the HBA.
The many time-outs cause a kernel panic or, at least, prevent normal
operation.

Comment 1 Jun'ichi NOMURA 2003-09-29 13:58:07 UTC
Created attachment 94814
patch to limit max queue depth

This patch does the following:
  - Changes the can_queue value of Scsi_Host from 0xfffff to REQUEST_ENTRY_CNT.
  - Adds a static variable qla1280_max_queue_depth_limit, set to
    REQUEST_ENTRY_CNT/16.
  - If the max_queue_depth parameter obtained from NVRAM exceeds
    qla1280_max_queue_depth_limit, sets max_queue_depth to
    qla1280_max_queue_depth_limit.
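
The effect of the limit can be modelled in a few lines of Python. This is not
the patch itself (the patch is C code against qla1280.c); the function names
are hypothetical, and the hiwat relation is the one given in the description
(max_queue_depth minus 1).

```python
REQUEST_ENTRY_CNT = 256  # ring buffer size quoted in the report

# Mirrors qla1280_max_queue_depth_limit = REQUEST_ENTRY_CNT/16.
MAX_QUEUE_DEPTH_LIMIT = REQUEST_ENTRY_CNT // 16


def effective_queue_depth(nvram_depth, limit=MAX_QUEUE_DEPTH_LIMIT):
    """Clamp the NVRAM max_queue_depth to the limit (hypothetical helper)."""
    return min(nvram_depth, limit)


def hiwat_for(nvram_depth):
    """hiwat is max_queue_depth minus 1, per the description."""
    return effective_queue_depth(nvram_depth) - 1


if __name__ == "__main__":
    for depth in (256, 16, 8):
        print(depth, effective_queue_depth(depth), hiwat_for(depth))
```

With the NVRAM value of 256 the clamped depth is 16 and hiwat becomes 15, so
a single LUN can hold only a small fraction of the 256 ring entries.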

Comment 2 Tom Coughlan 2005-03-30 18:57:20 UTC
The qla1280 is unmaintained in the 2.4 kernel. The RHEL 3 release notes indicate
this, and indicate that Red Hat only supports this driver on x86 platforms. As a
result, changes to this driver are risky and of limited benefit. We are not
planning to apply the requested patch to RHEL 3 for these reasons.