Description of problem: Hi, I am getting kernel panic under following configuration: Machine: DL580 multiprocessor Intel Xeon 2GHZ processors(2); 32-bit OS: Redhat Linux Advanced Server 2.1 Kernel Version:2.4.9-e25smp HBA: Qlogic2342 FC HBA Device connected: Ultrium FC interface tape drive. Info of HBA from /proc/scsi/qla2300/0 : QLogic PCI to Fibre Channel Host Adapter for QLA23xx: Firmware version: 3.02.16, Driver version 6.06.10 Entry address = f8881060 HBA: QLA2312 , Serial# A00000 Request Queue = 0x36afc000, Response Queue = 0x36ae0000 Request Queue count= 128, Response Queue count= 512 Total number of active commands = 0 Total number of interrupts = 20146612 Total number of IOCBs (used/max) = (0/600) Total number of queued commands = 0 Device queue depth = 0x20 Number of free request entries = 52 Number of mailbox timeouts = 0 Number of ISP aborts = 0 Number of loop resyncs = 0 Number of retries for empty slots = 0 Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0 Host adapter:loop state= <READY>, flags= 0x2820813 Dpc flags = 0x0 MBX flags = 0x0 SRB Free Count = 4096 Link down Timeout = 060 Port down retry = 008 Login retry count = 008 Commands retried with dropped frame(s) = 0 SCSI Device Information: scsi-qla0-adapter-node=200000e08b000000; scsi-qla0-adapter-port=200000e08b000000; scsi-qla0-target-0=100000e00222623c; scsi-qla0-target-1=100000e00242623c; SCSI LUN Information: (Id:Lun) * - indicates lun is not registered with the OS. ( 0: 0): Total reqs 95, Pending reqs 0, flags 0x0, 0:0:81, ( 0: 1): Total reqs 1622, Pending reqs 0, flags 0x0, 0:0:81, ( 0: 2): Total reqs 189, Pending reqs 0, flags 0x0, 0:0:81, ( 0: 3): Total reqs 4694518, Pending reqs 0, flags 0x0, 0:0:81, ( 1: 0): Total reqs 95, Pending reqs 0, flags 0x0, 0:0:82, ( 1: 1): Total reqs 12850865, Pending reqs 0, flags 0x0, 0:0:82, ( 1: 2): Total reqs 2602072, Pending reqs 0, flags 0x0, 0:0:82, Kernel panic information: kernel BUG at /usr/src/linux-2.4/include/asm/pci.h:145! invalid operand: 0000 Kernel 2.4.9-e.25smp CPU: 2 EIP: 0010:[<f8891658>] Tainted: P EFLAGS: 00010086 EIP is at qla2x00_64bit_start_scsi [qla2300] 0x498 eax: 0000003b ebx: 00000002 ecx: c02f6744 edx: 000ce00a esi: c7fb4000 edi: 00000011 ebp: f6afd868 esp: df15bc18 ds: 0018 es: 0018 ss: 0018 Process hp_ltt (pid: 26018, stackpage=df15b000) Stack: f88a2f00 00000091 00000202 c02f87cc 00000000 c02f8394 c02f8394 c02f8d40 00000000 000000f0 00004000 f6afd840 00000002 00000000 00000001 0000006f f55cf000 f6c20084 00000046 c02f8394 c02f8d40 00000000 00000020 f6afd840 Call Trace: [<f88a2f00>] .LC2 [qla2300] 0x0 [<f889a0f6>] qla2x00_next [qla2300] 0x206 [<f888bfbd>] qla2x00_queuecommand [qla2300] 0x3dd [<f8800690>] scsi_dispatch_cmd [scsi_mod] 0x150 [<f8800d50>] scsi_done [scsi_mod] 0x0 [<f880878e>] scsi_request_fn [scsi_mod] 0x31e [<f8807a34>] __scsi_insert_special [scsi_mod] 0x74 [<f8807a9a>] scsi_insert_special_req [scsi_mod] 0x1a [<f880097b>] scsi_do_req_Rsmp_bdc72156 [scsi_mod] 0x14b [<f89ede40>] sg_cmd_done_bh [sg] 0x0 [<f89ed05b>] sg_common_write [sg] 0x23b [<f89ede40>] sg_cmd_done_bh [sg] 0x0 [<f89ecdee>] sg_new_write [sg] 0x1ce [<f89ef9a1>] sg_build_reserve [sg] 0x51 [<f89ed330>] sg_ioctl [sg] 0x2b0 [<c013da56>] _wrapped_alloc_pages [kernel] 0x76 [<c012c612>] do_wp_page [kernel] 0x172 [<c012d19b>] do_no_page [kernel] 0x3b [<c012d5f0>] handle_mm_fault [kernel] 0xf0 [<c0125893>] collect_signal [kernel] 0x93 [<c0117f80>] do_page_fault [kernel] 0x0 [<c0118126>] do_page_fault [kernel] 0x1a6 [<c0107086>] do_signal [kernel] 0x66 [<c0126ed3>] sys_rt_sigaction [kernel] 0x93 [<c0155887>] sys_ioctl [kernel] 0x257 [<c01073c3>] system_call [kernel] 0x33 Code: 0f 0b 8d b6 00 00 00 00 83 c4 08 8d 04 5b 8d 14 c5 00 00 00 <0>Kernel panic: not continuing Could you please let me know the cause for this problem as early as possible..... Version-Release number of selected component (if applicable): Linux : 2.4.9-e25smp 2.4.9-e27smp How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Kernel oops -> changing component to kernel.
Related conversation on linux-kernel and linux-scsi: From: Andrew Vasquez <praka.net> To: linux-kernel.org, linux-scsi.org Cc: veeresh <vanami.com> Subject: Re: Kernel panic on Redhat Linux AS2.1 with QLogic 2342 HBA Date: Tue, 10 Feb 2004 10:25:13 -0800 On Tue, 10 Feb 2004, veeresh One of the scatter-gather entries of a SCSI command was NULL. Is any of the software you are running preparing SCSI commands and sending them down via SG perhaps? What type of I/O is occuring when the failure occurs? From: veeresh <vanami.com> Reply-To: vanami.com To: 'Andrew Vasquez' <praka.net>, linux-kernel.org, linux-scsi.org Cc: SAIDU BUHARI (HP-India,ex2) (E-mail) <saidu.buhari> Subject: RE: Kernel panic on Redhat Linux AS2.1 with QLogic 2342 HBA Date: Wed, 11 Feb 2004 18:51:56 +0530 Hi Andrew,Thanks for your quick response..Yes, the software prepares SCSI commands and sends them down via generic device driver SG. In our case the device file for related to device connected(LTO-2 FC interface tape drive) is "/dev/sg0". I have logged list of CDB the software sends to a device using SG driver and the attached the same. The log shows that the last CDB was Receive diagnostic command. I ran multiple times the software, but the SCSI command that failed was the same. Please let me know if you need any information. -------------- The driver is detecting an invalid scatter-gather list. This may be caused by the application that generates SCSI commands and passes them to the kernel through SG driver. Please review your application's use of data buffers and scatter gather. What is causeing the "Tainted: P" status? Please reproduce the problem on an untainted kernel.
Hi Tom, In our applications we are not using the scatter-gather list while passing commands to the kenel through SG driver. We set the iovec_count member of struct sg_io_hdr to zero for all the SCSI commands passed to kernel through SG driver. I reviewed the application for data buffers usage, it looks every thing is fine as we set the data buffers only for those SCSI commands that invloves data transfer from/to device. I am wondering how the entry in the scatter-gather list is NULL, whose is creating the list and under what circumstances that entry is becoming NULL? Regards, Veeresh
Is there any new status on this problem? I see from the output above that you are using driver version 6.06.10, while the driver we ship with the kernel is 6.04.01. I also see the kernel is tainted. Can you reproduce this with the 6.04.01 driver and a non-tainted kernel? Or have you solved the problem with a newer driver, or a fix to your application?
Since there are insufficient details provided in this report for us to investigate the issue further, and we have not received the feedback we requested, we will assume the problem was not reproduceable or has been fixed in a later update for this product. Users who have experienced this problem are encouraged to upgrade to the latest update release, and if this issue is still reproduceable, please contact the Red Hat Global Support Services page on our website for technical support options: https://www.redhat.com/support If you have a telephone based support contract, you may contact Red Hat at 1-888-GO-REDHAT for technical support for the problem you are experiencing.