Bug 115391 - Problems scanning multiple LUNs with aic7xxx module
Problems scanning multiple LUNs with aic7xxx module
Status: CLOSED CANTFIX
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel (Show other bugs)
3.0
i686 Linux
medium Severity medium
: ---
: ---
Assigned To: Tom Coughlan
Brian Brock
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2004-02-11 16:08 EST by Derek Suzuki
Modified: 2007-11-30 17:07 EST (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-09-19 09:41:55 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Derek Suzuki 2004-02-11 16:08:03 EST
Description of problem:
I have a pair of RHEL 3 servers connected via Adaptec 29320LP adapters
to an nStor 4110S hardware RAID SCSI array.  The array is configured
with a pair of volumes, at LUN 0 and LUN 1 (target 1).

With kernel-smp-2.4.21-4.0.2, only the first LUN would be detected at
boot time.  I tried adding "options scsi_mod max_scsi_luns=127" to
modules.conf, running mkinitrd, and appending "max_scsi_luns=127" to
the kernel line in grub.conf.  None of these made any difference. 
When I manually loaded scsi_mod, I got a message saying that the
max_scsi_luns option was not supported.

Recently I upgraded to kernel-smp-2.4.21-9.  Immediately I saw new
behavior.  When the aic79xx driver loads, it spews messages to the
console which suggest that it is trying and failing to probe all of
the LUNs (up through 255) on target 1.  This takes about ten minutes
to complete.  Afterwards, system startup continues normally and the
two LUNs are successfully configured as /dev/sda and /dev/sdb.

This behavior occurred out of the box, without max_scsi_luns being
set.  I tried setting the option in modules.conf, the initrd and
grub.conf with max_scsi_luns=3, but the driver still probes everything
through LUN 255.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-9

How reproducible:


Steps to Reproduce:
1.Configure SCSI RAID array with two LUNs
2.Attach array to Adaptec 29320LP
3.Boot system
  
Actual results:
System startup pauses for ~10 minutes as it probes LUNs and reports
errors to console.  Afterwards, the system starts normally.

Expected results:
I would expect the driver to only probe LUN 0 if max_scsi_luns is not
set.  If I do set it, I would expect not expect the probing to
continue passed the LUN number specified in max_scsi_luns.  The RHEL 3
update 1 release notes also imply that probing should stop as soon as
a non-responsive LUN is tested.

Additional info:
Sample console output when aic79xx module loads:

Feb  7 00:13:28 qadb1 kernel: TAT0[0x0] 
Feb 7 00:13:28 qadb1 kernel: SSTAT1[0x11] SSTAT2[0x0] SSTAT3[0x0]
PERRDIAG[0x0] 
                                                                     
          
Feb  7 00:13:28 qadb1 kernel: SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] 
LQISTAT2[0x0] 
Feb  7 00:13:28 qadb1 kernel: LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x80] 
Feb  7 00:13:28 qadb1 kernel: 
Feb 7 00:13:28 qadb1 kernel: SCB Count = 6 CMDS_PENDING = 1 LASTSCB
0x5 CURRSCB 
0x5 NEXTSCB 0x0 
Feb  7 00:13:28 qadb1 kernel: qinstart = 1 qinfifonext = 1 
Feb  7 00:13:28 qadb1 kernel: QINFIFO: 
Feb  7 00:13:28 qadb1 kernel: WAITING_TID_QUEUES: 
Feb  7 00:13:28 qadb1 kernel: Pending list: 
Feb 7 00:13:28 qadb1 kernel: 5 FIFO_USE[0x0] SCB_CONTROL[0x44]
SCB_SCSIID[0x17] 
Feb  7 00:13:28 qadb1 kernel: Total 1 
Feb  7 00:13:28 qadb1 syslog: klogd startup succeeded 
Feb  7 00:13:28 qadb1 kernel: Kernel Free SCB list: 4 3 2 1 0 
Feb  7 00:13:28 qadb1 kernel: Sequencer Complete DMA-inprog list: 
Feb  7 00:13:28 qadb1 kernel: Sequencer Complete list: 
Feb  7 00:13:28 qadb1 kernel: Sequencer DMA-Up and Complete list: 
Feb  7 00:13:28 qadb1 kernel: 
Feb  7 00:13:28 qadb1 kernel: scsi0: FIFO0 Free, LONGJMP == 0x80ff,
SCB 0x0 
Feb  7 00:13:28 qadb1 kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] 
DFSTATUS[0x89] 
Feb 7 00:13:28 qadb1 kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0]
DFFSXFRCTL[0x0] 
                                                                     
          
Feb  7 00:13:28 qadb1 kernel: SOFFCNT[0x1] MDFFSTAT[0x5] SHADDR =
0x00, SHCNT = 
0x0 
Feb  7 00:13:28 qadb1 kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] 
Feb  7 00:13:28 qadb1 kernel: scsi0: FIFO1 Free, LONGJMP == 0x8072,
SCB 0x5 
Feb  7 00:13:28 qadb1 kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] 
DFSTATUS[0x88] 
Feb 7 00:13:28 qadb1 kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0]
DFFSXFRCTL[0x0] 
                                                                     
          
Feb  7 00:13:28 qadb1 irqbalance: irqbalance startup succeeded 
Feb 7 00:13:28 qadb1 kernel: SOFFCNT[0x1] MDFFSTAT[0x5] SHADDR =
0x09e, SHCNT = 
0xffff62 
Feb  7 00:13:28 qadb1 kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] 
Feb  7 00:13:28 qadb1 kernel: LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
0x0 0x0 0x0 
0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 
Feb  7 00:13:28 qadb1 kernel: scsi0: LQISTATE = 0x0, LQOSTATE = 0x0,
OPTIONMODE 
= 0x42 
Feb  7 00:13:28 qadb1 kernel: scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 
Feb  7 00:13:28 qadb1 kernel: SIMODE0[0xc] 
Feb  7 00:13:28 qadb1 kernel: CCSCBCTL[0x4] 
Feb 7 00:13:28 qadb1 kernel: scsi0: REG0 == 0xff00, SINDEX = 0x1bc,
DINDEX = 
0x1ba 
Feb  7 00:13:28 qadb1 kernel: scsi0: SCBPTR == 0x27, SCB_NEXT == 0xff00, 
SCB_NEXT2 == 0x0 
Feb  7 00:13:28 qadb1 kernel: CDB 27 0 0 0 0 0 
Feb  7 00:13:28 qadb1 kernel: STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x29 0x14d 
Feb  7 00:13:28 qadb1 kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends 
>>>>>>>>>>>>>>>>>> 
Feb  7 00:13:28 qadb1 kernel: DevQ(0:1:0): 0 waiting 
Feb  7 00:13:28 qadb1 kernel: DevQ(0:1:231): 0 waiting 
Feb  7 00:13:28 qadb1 kernel: scsi0:A:1:39: Target did not send an
IDENTIFY 
message. LASTPHASE = 0x60. 
Feb  7 00:13:28 qadb1 kernel: scsi0: Issued Channel A Bus Reset. 1
SCBs aborted 
Feb 7 00:13:28 qadb1 kernel: scsi0:A:1: no active SCB for reconnecting
target - 
issuing BUS DEVICE RESET 
Feb  7 00:13:28 qadb1 kernel: SAVED_SCSIID == 0x17, SAVED_LUN == 0x28,
REG0 == 
0xff00 ACCUM = 0x0 
Feb  7 00:13:28 qadb1 kernel: SEQ_FLAGS == 0xc0, SCBPTR == 0x28, BTT
== 0xff00, 
SINDEX == 0x1bc 
Feb  7 00:13:28 qadb1 kernel: SELID == 0x10, SCB_SCSIID == 0x0,
SCB_LUN == 0x0, 
SCB_CONTROL == 0x0 
Feb  7 00:13:28 qadb1 kernel: SCSIBUS[0] == 0xb1, SCSISIGI == 0x65 
Feb  7 00:13:28 qadb1 kernel: SXFRCTL0 == 0x88 
Feb  7 00:13:28 qadb1 kernel: SEQCTL0 == 0x10 
Feb  7 00:13:28 qadb1 kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins 
<<<<<<<<<<<<<<<<< 
Feb 7 00:13:28 qadb1 kernel: scsi0: Dumping Card State at program
address 0x165 
Mode 0x33 
Feb  7 00:13:28 qadb1 kernel: Card was paused 
Feb  7 00:13:28 qadb1 kernel: HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] 
SAVED_MODE[0x11] 
Feb  7 00:13:28 qadb1 kernel: DFFSTAT[0x31] SCSISIGI[0x65] SCSIPHASE[0x2] 
SCSIBUS[0xb1] 
Feb  7 00:13:28 qadb1 portmap: portmap startup succeeded 
Feb  7 00:13:28 qadb1 kernel: LASTPHASE[0x60] SCSISEQ0[0x0]
SCSISEQ1[0x12] 
SEQCTL0[0x10] 
Feb  7 00:13:28 qadb1 kernel: SEQINTCTL[0x0] SEQ_FLAGS[0xc0]
SEQ_FLAGS2[0x0] 
SSTAT0[0x0] 
Feb 7 00:13:28 qadb1 kernel: SSTAT1[0x11] SSTAT2[0x0] SSTAT3[0x0]
PERRDIAG[0x0] 
Feb  7 00:13:28 qadb1 kernel: SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] 
LQISTAT2[0x0] 
Feb  7 00:13:28 qadb1 kernel: LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x80] 
Feb  7 00:13:28 qadb1 kernel: 
Feb 7 00:13:28 qadb1 kernel: SCB Count = 6 CMDS_PENDING = 1 LASTSCB
0x5 CURRSCB 
0x5 NEXTSCB 0x0 
Feb  7 00:13:28 qadb1 kernel: qinstart = 1 qinfifonext = 1 
Feb  7 00:13:28 qadb1 kernel: QINFIFO: 
Feb  7 00:13:28 qadb1 kernel: WAITING_TID_QUEUES: 
Feb  7 00:13:28 qadb1 kernel: Pending list: 
Feb 7 00:13:28 qadb1 kernel: 5 FIFO_USE[0x0] SCB_CONTROL[0x44]
SCB_SCSIID[0x17] 
Feb  7 00:13:28 qadb1 kernel: Total 1 
Feb  7 00:13:28 qadb1 kernel: Kernel Free SCB list: 4 3 2 1 0 
Feb  7 00:13:28 qadb1 kernel: Sequencer Complete DMA-inprog list: 
Feb  7 00:13:28 qadb1 kernel: Sequencer Complete list: 
Feb  7 00:13:28 qadb1 kernel: Sequencer DMA-Up and Complete list: 
Feb  7 00:13:28 qadb1 kernel: 
Feb  7 00:13:28 qadb1 kernel: scsi0: FIFO0 Free, LONGJMP == 0x80ff,
SCB 0x0 
Feb  7 00:13:28 qadb1 kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] 
DFSTATUS[0x89] 
Feb 7 00:13:28 qadb1 kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0]
DFFSXFRCTL[0x0] 
Feb  7 00:13:28 qadb1 kernel: SOFFCNT[0x1] MDFFSTAT[0x5] SHADDR =
0x00, SHCNT = 
0x0 
Feb  7 00:13:28 qadb1 kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] 
Feb  7 00:13:28 qadb1 kernel: scsi0: FIFO1 Free, LONGJMP == 0x8072,
SCB 0x5 
Feb  7 00:13:28 qadb1 kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] 
DFSTATUS[0x88] 
Feb 7 00:13:28 qadb1 kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0]
DFFSXFRCTL[0x0] 
Feb 7 00:13:28 qadb1 kernel: SOFFCNT[0x1] MDFFSTAT[0x5] SHADDR =
0x09e, SHCNT = 
0xffff62 
Feb  7 00:13:28 qadb1 kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] 
Feb  7 00:13:28 qadb1 rpc.statd[1345]: Version 1.0.5 Starting 
Feb  7 00:13:28 qadb1 kernel: LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
0x0 0x0 0x0 
0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 
Feb  7 00:13:28 qadb1 kernel: scsi0: LQISTATE = 0x0, LQOSTATE = 0x0,
OPTIONMODE 
= 0x42 
Feb  7 00:13:28 qadb1 kernel: scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 
Feb  7 00:13:28 qadb1 kernel: SIMODE0[0xc] 
Feb  7 00:13:28 qadb1 kernel: CCSCBCTL[0x4] 
Feb 7 00:13:28 qadb1 kernel: scsi0: REG0 == 0xff00, SINDEX = 0x1bc,
DINDEX = 
0x1ba 
Feb  7 00:13:28 qadb1 nfslock: rpc.statd startup succeeded 
Feb  7 00:13:28 qadb1 kernel: scsi0: SCBPTR == 0x28, SCB_NEXT == 0xff00, 
SCB_NEXT2 == 0x0 
Feb  7 00:13:28 qadb1 kernel: CDB 28 0 0 0 0 0 
Feb  7 00:13:28 qadb1 kernel: STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x29 0x14d 
Feb  7 00:13:28 qadb1 kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends 
>>>>>>>>>>>>>>>>>> 
Feb  7 00:13:28 qadb1 kernel: DevQ(0:1:0): 0 waiting 
Feb  7 00:13:28 qadb1 kernel: DevQ(0:1:232): 0 waiting 
Feb  7 00:13:28 qadb1 kernel: scsi0:A:1:40: Target did not send an
IDENTIFY 
message. LASTPHASE = 0x60. 
Feb  7 00:13:28 qadb1 kernel: scsi0: Issued Channel A Bus Reset. 1
SCBs aborted
Comment 1 Tom Coughlan 2004-12-21 16:34:46 EST
I apologize for the long delay in replying to this.

All the LUNs on the target are scanned if the storage device is listed
in the scsi_scan.c "whitelist", with the SPARSELUN flag set. This
overrides the setting of max_scsi_luns. 

If you post the output of dmesg showing the boot messages, or
/proc/scsi/scsi, I will check the list for your vendor-id and product-id. 

Aside from that, it appears that the adapter is having trouble
scanning the LUN space. I know of one problem with the current version
of the aic79xx driver that could be the cause. These drivers increase
the max LUN they scan from 64 to 256.  The problem occurs because on a
parallel SCSI bus, the driver must use the packetized protocol to
address LUNs > 63. The driver is not doing this, and instead, it is
using the non-packetized protocol for LUN 64 and above.  This can
cause a veriety of problems, including command timeouts, and
non-existant device configuration. 

If you are still having this problem, please post the vendor-id and
product-id information, and the full listing of errors. 
Comment 2 Tom Coughlan 2005-09-19 09:41:55 EDT
Since there are insufficient details provided in this report for us to
investigate the issue further, and we have not received the feedback we
requested, we will assume the problem was not reproduceable or has been fixed in
a later update for this product.

Users who have experienced this problem are encouraged to upgrade to the latest
update release, and if this issue is still reproducible, please contact the Red
Hat Global Support Services page on our website for technical support options:
https://www.redhat.com/support

If you have a telephone based support contract, you may contact Red Hat at
1-888-GO-REDHAT for technical support for the problem you are experiencing. 

Note You need to log in before you can comment on or make changes to this bug.