From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830

Description of problem:
The 'dump' program hangs after displaying the following messages:

# dump -0u -f /dev/st0 /
  DUMP: Date of this level 0 dump: Mon Oct 14 18:51:48 2002
  DUMP: Dumping /dev/sda2 (/) to /dev/st0
  DUMP: Added inode 8 to exclude list (journal inode)
  DUMP: Added inode 7 to exclude list (resize inode)
  DUMP: Label: /
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 1526917 tape blocks.
  DUMP: Volume 1 started with block 1 at: Mon Oct 14 18:51:51 2002
  DUMP: dumping (Pass III) [directories]

At this point, /var/log/messages contains the following lines:

Oct 14 18:51:51 localhost kernel: scsi0:A:5: Missed busfree. Lastphase = 0xe0, Curphase = 0x0
Oct 14 18:51:51 localhost last message repeated 2 times
Oct 14 18:51:51 localhost kernel: scsi0: Missing case in ahc_handle_scsiint. status = 8

A check of /lib/modules/2.4.18-14/kernel/drivers/scsi/aic7xxx/aic7xxx.o shows that it is the source of those error messages.

# lsmod | grep aic
aic7xxx 137140 1
scsi_mod 107144 4 [st ips aic7xxx sd_mod]

Version-Release number of selected component (if applicable):
# rpm -q dump kernel
dump-0.4b28-4
kernel-2.4.18-14

How reproducible:
Always

Steps to Reproduce:
1. Before starting Linux, set the SCSI data transfer rate of the Adaptec onboard SCSI card to 'ASYNC' for the DLT tape drive. The default rate, 160MB/sec, is too high for the tape drive, which will only accept data at approximately 6MB/sec.
2. Insert a tape into the DLT tape drive.
3. Boot Linux and log on as 'root'.
4. At the shell prompt, run the 'dump' command:
   # dump -0u -f /dev/st0 /

Actual Results:
'dump' does not display any messages after the messages above, even after a period of 30 minutes.

/var/log/messages:
------------------
Oct 14 14:55:12 localhost kernel: (scsi0:A:5): 6.600MB/s transfers (16bit)
Oct 14 14:55:12 localhost kernel: st0: Block limits 2 - 16777214 bytes.
(text omitted)
Oct 14 18:51:51 localhost kernel: scsi0:A:5: Missed busfree. Lastphase = 0xe0, Curphase = 0x0
Oct 14 18:51:51 localhost last message repeated 2 times
Oct 14 18:51:51 localhost kernel: scsi0: Missing case in ahc_handle_scsiint. status = 8

Expected Results:
The 'dump' program is supposed to copy the entire contents of the '/' directory tree to the backup tape.

Additional info:

From /var/log/dmesg:
--------------------
SCSI subsystem driver Revision: 1.00
kmod: failed to exec /sbin/modprobe -s -k scsi_hostadapter, errno = 2
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
  <Adaptec aic7899 Ultra160 SCSI adapter>
  aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
  <Adaptec aic7899 Ultra160 SCSI adapter>
  aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
blk: queue c3620e14, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: QUANTUM Model: DLT8000 Rev: 0250
  Type: Sequential-Access ANSI SCSI revision: 02
blk: queue f7fbea14, I/O limit 4095Mb (mask 0xffffffff)
blk: queue f7fbea14, I/O limit 4095Mb (mask 0xffffffff)
scsi2 : IBM PCI ServeRAID 5.10.21 <ServeRAID 4Lx>
  Vendor: IBM Model: SERVERAID Rev: 1.00
  Type: Direct-Access ANSI SCSI revision: 02
  Vendor: IBM Model: SERVERAID Rev: 1.00
  Type: Processor ANSI SCSI revision: 02
  Vendor: IBM Model: YGLv3 S2 Rev: 0
  Type: Processor ANSI SCSI revision: 02
Attached scsi disk sda at scsi2, channel 0, id 0, lun 0
SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB)
Partition check:
 sda: sda1 sda2 sda3
Journalled Block Device driver loaded
(text omitted)
EXT3 FS 2.4-0.9.18, 14 May 2002 on sd(8,2), internal journal
Adding Swap: 2040244k swap-space (priority -1)
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.18, 14 May 2002 on sd(8,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
st: Version 20020205, bufsize 32768, wrt 30720, max init. bufs 4, s/g segs 16
Attached scsi tape st0 at scsi0, channel 0, id 5, lun 0

From /var/log/messages: (prior to starting 'dump')
-----------------------
Oct 14 14:50:27 localhost kernel: EXT3 FS 2.4-0.9.18, 14 May 2002 on sd(8,1), internal journal
Oct 14 14:50:27 localhost kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 14 14:50:27 localhost kernel: st: Version 20020205, bufsize 32768, wrt 30720, max init. bufs 4, s/g segs 16
Oct 14 14:50:27 localhost kernel: Attached scsi tape st0 at scsi0, channel 0, id 5, lun 0
Oct 14 14:50:27 localhost kernel: parport0: PC-style at 0x378 [PCSPP]
Oct 14 14:50:27 localhost kernel: ohci1394: pci_module_init failed
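For anyone trying to confirm the same symptom, the check of /var/log/messages described above could be scripted roughly as follows. This is only a sketch: the function name count_aic_errors is made up for illustration, and the two patterns are simply the message texts quoted in this report, so other driver revisions may word them differently.

```shell
# Hypothetical helper (not part of dump or the aic7xxx driver):
# count the aic7xxx error lines in a syslog-style file.
# The patterns are copied from the messages quoted in this report.
# "last message repeated N times" lines are not counted.
count_aic_errors() {
    logfile=$1
    grep -c -E 'Missed busfree|Missing case in ahc_handle_scsiint' "$logfile"
}
```

Running `count_aic_errors /var/log/messages` (as root) right after 'dump' stalls should print a non-zero count if the hang is the one reported here.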
I reported this problem to the 'dump' mailing list (dump-users), and got the following reply from dump's maintainer:

> I am having a problem getting 'dump' to work.
> It hangs shortly after starting.
[...]
> Oct 9 17:00:02 localhost kernel: scsi0:A:5: Missed busfree. Lastphase =
> 0xe0, Curphase = 0x0
> Oct 9 17:00:03 localhost last message repeated 3 times
> Oct 9 17:00:03 localhost kernel: scsi0: Missing case in
> ahc_handle_scsiint. status = 8

This is not a dump issue but a kernel issue. For some reason, the big amount of data dump is trying to send to your tape drive causes problems in the kernel's SCSI subsystem.

Be sure to report this to the maintainer of this specific SCSI driver and/or to the linux-kernel mailing list.

Thanks,

Stelian.
--
Stelian Pop <stelian.pop.com>
Alcove - http://www.alcove.com
I tried using the older aic7xxx module, 'aic7xxx_old', but got the same result, that is, 'dump' hangs after printing the line that starts with 'DUMP: Volume 1 started with block 1 at:...'. Here are the steps I took (as 'root'):

# vi /etc/modules.conf
  Replace 'alias scsi_hostadapter aic7xxx' with 'alias scsi_hostadapter aic7xxx_old'
# /sbin/rmmod aic7xxx
# /sbin/modprobe -s -k aic7xxx_old
# /sbin/lsmod | grep aic7xxx
aic7xxx_old 125664 1 (autoclean)
scsi_mod 107144 4 [aic7xxx_old st ips sd_mod]
# /sbin/dump -0u -f /dev/st0 /
  DUMP: Date of this level 0 dump: Tue Oct 15 12:18:12 2002
  DUMP: Dumping /dev/sda2 (/) to /dev/st0
  DUMP: Added inode 8 to exclude list (journal inode)
  DUMP: Added inode 7 to exclude list (resize inode)
  DUMP: Label: /
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 1527330 tape blocks.
  DUMP: Volume 1 started with block 1 at: Tue Oct 15 12:18:15 2002
('dump' hangs until Ctrl-C is pressed)

aic7xxx_old is less informative about what is causing the problem; there are no messages in /var/log/messages after:

Oct 15 12:18:15 localhost kernel: st0: Block limits 2 - 16777214 bytes.
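The manual edit of /etc/modules.conf in the steps above could also be done non-interactively. The following is a rough sketch only: the function name switch_to_old_driver is invented for illustration, and it assumes the alias line is written exactly as quoted in this comment, with nothing after the module name.

```shell
# Hypothetical helper: print a copy of a modules.conf-style file with the
# scsi_hostadapter alias switched from aic7xxx to aic7xxx_old.
# It writes to stdout only; the input file itself is not modified.
# Other aliases (for example scsi_hostadapter1) are left untouched.
switch_to_old_driver() {
    sed 's/^alias[[:space:]][[:space:]]*scsi_hostadapter[[:space:]][[:space:]]*aic7xxx$/alias scsi_hostadapter aic7xxx_old/' "$1"
}
```

One cautious way to use it is `switch_to_old_driver /etc/modules.conf > /tmp/modules.conf.new`, then review the result before copying it into place and reloading the module with the rmmod and modprobe commands listed above.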
I upgraded the kernel on my computer to the latest patch level, 2.4.18-17.8.0, and retried backing up my RAID drive to tape. The 'dump' utility hangs again at the same point, and the SCSI driver logs the same messages in /var/log/messages:

localhost kernel: scsi0:A:5: Missed busfree. Lastphase = 0xe0, Curphase = 0x0
localhost kernel: scsi0: Missing case in ahc_handle_scsiint. status = 8
I might be able to provide more information about this problem if someone at Red Hat could tell me the following:

1. Where to get the kernel 2.4.18 source. I looked at www.kernel.org, which listed 2.4.19.
2. The exact steps for patching the source with what the kernel source .rpm provides (essentially, it's a big patch file). Probably '# patch -p0 < patchfile'? I am assuming that I can use the kernel-2.4.18-17.8.0.src.rpm for patching 2.4.18.
3. The steps Red Hat takes to build the kernel (what optimization flags, etc.). Or, if I don't need all of the kernel, then the steps to build the device driver aic7xxx.o.
4. The steps needed to install a new kernel over a running kernel.

Any help you can give me with these steps will help me to produce a valid, debuggable aic7xxx.o file.
Please disregard my most recent questions. The answers are provided in the Red Hat Linux 8.0 Customization Guide, Appendix A, "Building a Custom Kernel". After installing the kernel-source .rpm file from CD 2 and running Red Hat Update Agent, I now have the source code for kernel 2.4.18-17.8.0 installed.
This may not be a Linux problem at all. We are seeing tape issues under FreeBSD, Windows, and Solaris when the tape drive is connected to any Adaptec U160 (aic789x) controller. In some instances, the system hangs during BIOS POST, long before any Linux code comes into play.

Tested controllers include the 29160, 39160, and embedded 7899 (Dell and Supermicro) on Intel 810, 815, and 845, Asus, FIC, MSI, and Shuttle mainboards. Tests with other U160 controllers (Symbios and LSI) have resulted in proper operation.

This has been reported to both the tape drive manufacturers and to Adaptec, but no solution seems readily forthcoming.

Our recommendation is to add a non-789x controller (Adaptec 7880 cards, Advansys, ACARD, Symbios, Initio, et al.) to the system for the tape drive. One easy fix is the SiiG AP40 from sources like Microcenter and CompUSA. US$70 gets you an Ultrawide 40MB/sec controller that uses the ACARD atp870u driver.

Tim
FYI, I rebuilt the kernel including the (diagnostic) changes to aic7xxx_core.c that were requested by the maintainer of this driver, Justin Gibbs. Below are the results that I sent to him on 25 October 2002. I haven't heard a reply (I sent the message a second time a few weeks later).

Justin,

Below is the log from /var/log/messages after I added the diagnostic code that you requested to aic7xxx_core.c and rebuilt the kernel. (Please reply to this message to let me know that you got it.)

Thanks for your help with this problem!

-mark

Oct 25 18:35:53 localhost kernel: (scsi0:A:5): 6.600MB/s transfers (16bit)
Oct 25 18:35:53 localhost kernel: st0: Block limits 2 - 16777214 bytes.
Oct 25 18:35:53 localhost kernel: scsi0:A:5: Missed busfree. Lastphase = 0xe0, Curphase = 0x0
Oct 25 18:35:53 localhost kernel: scsi0: Missing case in ahc_handle_scsiint. status = 8
Oct 25 18:35:53 localhost kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x1
Oct 25 18:35:53 localhost kernel: ACCUM = 0x0, SINDEX = 0xb1, DINDEX = 0xe4, ARG_2 = 0x0
Oct 25 18:35:53 localhost kernel: HCNT = 0x0 SCBPTR = 0x0
Oct 25 18:35:53 localhost kernel: SCSISEQ = 0x12, SBLKCTL = 0xa
Oct 25 18:35:53 localhost kernel: DFCNTRL = 0x4, DFSTATUS = 0x89
Oct 25 18:35:53 localhost kernel: LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
Oct 25 18:35:53 localhost kernel: SSTAT0 = 0x0, SSTAT1 = 0x9
Oct 25 18:35:53 localhost kernel: SCSIPHASE = 0x0
Oct 25 18:35:53 localhost kernel: STACK == 0x43, 0x0, 0x160, 0x108
Oct 25 18:35:53 localhost kernel: SCB count = 4
Oct 25 18:35:53 localhost kernel: Kernel NEXTQSCB = 3
Oct 25 18:35:53 localhost kernel: Card NEXTQSCB = 3
Oct 25 18:35:53 localhost kernel: QINFIFO entries:
Oct 25 18:35:53 localhost kernel: Waiting Queue entries:
Oct 25 18:35:53 localhost kernel: Disconnected Queue entries:
Oct 25 18:35:53 localhost kernel: QOUTFIFO entries:
Oct 25 18:35:53 localhost kernel: Sequencer Free SCB List: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Oct 25 18:35:53 localhost kernel: Sequencer SCB Info: 0(c 0x40, s 0x57, l 0, t 0xff) 1(c 0x0, s 0xff, l 255, t 0xff) 2(c 0x0, s 0xff, l 255, t 0xff) 3(c 0x0, s 0xff, l 255, t 0xff) 4(c 0x0, s 0xff, l 255, t 0xff) 5(c 0x0, s 0xff, l 255, t 0xff) 6(c 0x0, s 0xff, l 255, t 0xff) 7(c 0x0, s 0xff, l 255, t 0xff) 8(c 0x0, s 0xff, l 255, t 0xff) 9(c 0x0, s 0xff, l 255, t 0xff) 10(c 0x0, s 0xff, l 255, t 0xff) 11(c 0x0, s 0xff, l 255, t 0xff) 12(c 0x0, s 0xff, l 255, t 0xff) 13(c 0x0, s 0xff, l 255, t 0xff) 14(c 0x0, s 0xff, l 255, t 0xff) 15(c 0x0, s 0xff, l 255, t 0xff) 16(c 0x0, s 0xff, l 255, t 0xff) 17(c 0x0, s 0xff, l 255, t 0xff) 18(c 0x0, s 0xff, l 255, t 0xff) 19(c 0x0, s 0xff, l 255, t 0xff) 20(c 0x0, s 0xff, l 255, t 0xff) 21(c 0x0, s 0xff, l 255, t 0xff) 22(c 0x0, s 0xff, l 255, t 0xff) 23(c 0x0, s 0xff, l 255, t 0xff) 24(c 0x0, s 0xff, l 255, t 0xff) 25(c 0x0, s 0xff, l 255, t 0xff) 26(c 0x0, s 0xff, l 255, t 0xff) 27(c 0x0, s 0xff, l 255, t 0xff) 28(c 0x0, s 0xff, l 255, t 0xff) 29(c 0x0, s 0xff, l 255, t 0x
Oct 25 18:35:53 localhost kernel: f) 30(c 0x0, s 0xff, l 255, t 0xff) 31(c 0x0, s 0xff, l 255, t 0xff)
Oct 25 18:35:53 localhost kernel: Pending list:
Oct 25 18:35:53 localhost kernel: Kernel Free SCB list: 2 1 0
Oct 25 18:35:53 localhost kernel: DevQ(0:5:0): 0 waiting'

> -----Original Message-----
> From: Gibbs, Justin [Justin_Gibbs]
> Sent: Tuesday, October 22, 2002 1:30 PM
> To: Harig, Mark A.
> Subject: RE: Problem with Adaptec aic7899 Ultra160 SCSI adapter driver
>
> Mark,
>
> This is going to be a bit tricky to debug remotely. One thing you can
> do for me is to insert a call to "ahc_dump_card_state(ahc);" right below
> the line in drivers/scsi/aic7xxx/aic7xxx_core.c:ahc_handle_scsiint where
> it says:
>
>     printf("%s: Missing case in ahc_handle_scsiint. status = 0x%x\n",
>            ahc_name(ahc), status);
>
> It's near the end of the function.
>
> BTW, just because your tape drive can only stream at 6.6MB/s, you don't
> need to set the sync rate speed that low. The drive will actually burst
> transfers across the SCSI bus at much higher rates.
>
> We have a DLT4000 here that I'm trying to use to reproduce your problem.
> I'll let you know what I find.
>
> --
> Justin
>
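For anyone who wants to repeat the diagnostic change Justin describes, the edit could be scripted roughly like this. It is a sketch under assumptions: the function name add_card_state_dump is invented, the two-tab indentation of the inserted call is a guess, and the script assumes the printf statement ends with ");" as in the quoted email. It has only been reasoned through against that quoted fragment, not against a real kernel tree.

```shell
# Hypothetical helper: print a copy of a C source file with a call to
# ahc_dump_card_state(ahc) inserted right after the printf statement
# that contains "Missing case in ahc_handle_scsiint".
# The statement may span several lines; the call is inserted after the
# first line that closes it with ");". The input file is not modified.
add_card_state_dump() {
    awk '
        { print }
        /Missing case in ahc_handle_scsiint/ { in_printf = 1 }
        in_printf && /\);[ \t]*$/ {
            print "\t\tahc_dump_card_state(ahc);"   # indentation is assumed
            in_printf = 0
        }
    ' "$1"
}
```

A cautious way to use it is to write the output to a new file, for example `add_card_state_dump aic7xxx_core.c > aic7xxx_core.c.new`, and compare the two with diff before rebuilding the driver.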
IBM has released a new version of their UpdateXpress CD (version 2.03). This includes updates to the POST/BIOS firmware and to the ServeRAID firmware (to version 6.00). These updates appear to have fixed the problem that I reported originally, i.e., 'dump' no longer hangs. However, during the boot process the ips.o device driver reports several warning messages (recorded in /var/log/dmesg):

Warning: Adapter 0 Firmware Compatible version is MR600, but should be SA510
Warning: Adapter 0 BIOS Compatible version is MR600, but should be SA510
Warning: ! ! ! ServeRAID Version mismatch

I examined the source code for ips.c that IBM provides on the UpdateXpress CD and compared it with the source code that is provided with the latest Red Hat 8.0 kernel version, 2.4.20-18.8. According to the changelogs at the top of the source files, the Red Hat 8.0 version of ips.c is version 5.00.01, while the IBM version of ips.c is 6.00.00 and includes changes for three other versions that the Red Hat version does not.

Are there any plans to include version 6.00.00 of ips.c in a future release of the Red Hat kernels?
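To see at a glance which ServeRAID components the driver considers mismatched, the warnings above could be summarized with something like the following. This is only a sketch: the function name list_ips_mismatches is invented, and the pattern is based solely on the warning text quoted in this comment.

```shell
# Hypothetical helper: reduce ips "Compatible version" warnings in a
# dmesg-style file to "component found expected" triples, one per line.
# Lines that do not match the quoted warning format are ignored.
list_ips_mismatches() {
    sed -n 's/^.*Warning: Adapter [0-9][0-9]* \(.*\) Compatible version is \([^,]*\), but should be \([^ ]*\).*$/\1 \2 \3/p' "$1"
}
```

Calling `list_ips_mismatches /var/log/dmesg` after boot would list each affected component next to the version found and the version the driver expected.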
Created attachment 93279
The ChangeLog for the 'ips' SCSI device driver

Created attachment 93280
The source file 'ips.c' for version 6.00.00 of the 'ips' device driver.

Created attachment 93281
The source file 'ips.h' for version 6.00.00 of the 'ips' device driver

I have rebuilt the kernel using the source code for versions 2.4.20-18.8 and 2.4.20-19.9 and these attached 'ips' device driver files without any problems.
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/