From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Description of problem:
While writing data with tar to the Travan 40 IDE tape device, which uses the ide-scsi driver, the system stops responding (kernel panic). This has been seen on a PowerEdge 1600 and a 600SC.

Version-Release number of selected component (if applicable):
kernel-2.6.9-5

How reproducible:
Always

Steps to Reproduce:
1. Install RHEL4 RC1 on either of the 2 platforms with a Travan 40 tape drive.
2. After successful installation, edit grub.conf by appending "hdX=ide-scsi" at the end of the kernel parameter line.
3. Type in a few commands such as "mt -f /dev/st0 status", "mt -f /dev/st0 rewind", "mt -f /dev/st0 eject": these work fine.
4. Now type the command "tar -cvf /dev/st0 /root": this results in a kernel panic (outputs attached).

Actual Results:
Kernel Oops followed by a panic

Additional info:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c026152d
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 sunrpc ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables st ide_scsi button battery ac ohci_hcd tg3 e1000 floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod megaraid_mbox megaraid_mm aic7xxx sd_mod scsi_mod
CPU: 0
EIP: 0060:[<c026152d>] Not tainted VLI
EFLAGS: 00010046 (2.6.9-5.EL)
EIP is at ide_outsw+0x5/0xa
eax: 000001e8 ebx: 00001001 ecx: 00000800 edx: 000001e8
esi: 00000000 edi: 00000400 ebp: c0421fb8 esp: c03a6f08
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c03a6000 task=c0349bc0)
Stack: c0422064 c02618fd 00000000 00001001 c0422064 00000000 c0421fb8 c026195b
       00001000 f64e6100 00001800 c0422064 f890d12f 00000058 f64e6100 c0422064
       00002800 f890d936 c03a6f70 00000000 c2f2c180 f76cb000 c0422064 f7ecf200
Call Trace:
 [<c02618fd>] ata_output_data+0x60/0x66
 [<c026195b>] atapi_output_bytes+0x19/0x3f
 [<f890d12f>] idescsi_output_buffers+0x6a/0x8c [ide_scsi]
 [<f890d936>] idescsi_pc_intr+0x1b4/0x204 [ide_scsi]
 [<c02611b3>] ide_intr+0x263/0x396
 [<f890d782>] idescsi_pc_intr+0x0/0x204 [ide_scsi]
 [<c0107b97>] handle_IRQ_event+0x25/0x4f
 [<c01084d1>] do_IRQ+0x11c/0x242
 [<c0301d40>] common_interrupt+0x18/0x20
 [<c010403b>] default_idle+0x23/0x26
 [<c010408c>] cpu_idle+0x1f/0x34
 [<c03a76b4>] start_kernel+0x20f/0x211
Code: c3 57 89 d7 89 c2 f3 66 6d 5f c3 89 c2 ed c3 57 89 d7 89 c2 f3 6d 5f c3 ee c3 89 d0 89 ca ee c3 0f b7 c0 66 ef c3 56 89 d6 89 c2 <f3> 66 6f 5e c3 ef c3 56 89 d6 89 c2 f3 6f 5e c3 c7 80 f8 04 00
 <0>Kernel panic - not syncing: Fatal exception in interrupt
Badness in panic at kernel/panic.c:117
 [<c011f5c3>] panic+0x130/0x13d
 [<c01067be>] die+0x224/0x22b
 [<c01195b4>] do_page_fault+0x380/0x4dc
 [<c026152d>] ide_outsw+0x5/0xa
 [<c0144a13>] mempool_free+0x11f/0x122
 [<c01675d2>] bio_put+0x27/0x28
 [<c0119234>] do_page_fault+0x0/0x4dc
 [<c0301d7f>] error_code+0x2f/0x38
 [<c026152d>] ide_outsw+0x5/0xa
 [<c02618fd>] ata_output_data+0x60/0x66
 [<c026195b>] atapi_output_bytes+0x19/0x3f
 [<f890d12f>] idescsi_output_buffers+0x6a/0x8c [ide_scsi]
 [<f890d936>] idescsi_pc_intr+0x1b4/0x204 [ide_scsi]
 [<c02611b3>] ide_intr+0x263/0x396
 [<f890d782>] idescsi_pc_intr+0x0/0x204 [ide_scsi]
 [<c0107b97>] handle_IRQ_event+0x25/0x4f
 [<c01084d1>] do_IRQ+0x11c/0x242
 [<c0301d40>] common_interrupt+0x18/0x20
 [<c010403b>] default_idle+0x23/0x26
 [<c010408c>] cpu_idle+0x1f/0x34
 [<c03a76b4>] start_kernel+0x20f/0x211
Badness in i8042_panic_blink at drivers/input/serio/i8042.c:983
 [<c0234d22>] i8042_panic_blink+0xc0/0x1a0
 [<c011f585>] panic+0xf2/0x13d
 [<c01067be>] die+0x224/0x22b
 [<c01195b4>] do_page_fault+0x380/0x4dc
 [<c026152d>] ide_outsw+0x5/0xa
 [<c0144a13>] mempool_free+0x11f/0x122
 [<c01675d2>] bio_put+0x27/0x28
 [<c0119234>] do_page_fault+0x0/0x4dc
 [<c0301d7f>] error_code+0x2f/0x38
 [<c026152d>] ide_outsw+0x5/0xa
 [<c02618fd>] ata_output_data+0x60/0x66
 [<c026195b>] atapi_output_bytes+0x19/0x3f
 [<f890d12f>] idescsi_output_buffers+0x6a/0x8c [ide_scsi]
 [<f890d936>] idescsi_pc_intr+0x1b4/0x204 [ide_scsi]
 [<c02611b3>] ide_intr+0x263/0x396
 [<f890d782>] idescsi_pc_intr+0x0/0x204 [ide_scsi]
 [<c0107b97>] handle_IRQ_event+0x25/0x4f
 [<c01084d1>] do_IRQ+0x11c/0x242
 [<c0301d40>] common_interrupt+0x18/0x20
 [<c010403b>] default_idle+0x23/0x26
 [<c010408c>] cpu_idle+0x1f/0x34
 [<c03a76b4>] start_kernel+0x20f/0x211
Badness in i8042_panic_blink at drivers/input/serio/i8042.c:986
 [<c0234d97>] i8042_panic_blink+0x135/0x1a0
 [<c011f585>] panic+0xf2/0x13d
 [<c01067be>] die+0x224/0x22b
 [<c01195b4>] do_page_fault+0x380/0x4dc
 [<c026152d>] ide_outsw+0x5/0xa
 [<c0144a13>] mempool_free+0x11f/0x122
 [<c01675d2>] bio_put+0x27/0x28
 [<c0119234>] do_page_fault+0x0/0x4dc
 [<c0301d7f>] error_code+0x2f/0x38
 [<c026152d>] ide_outsw+0x5/0xa
 [<c02618fd>] ata_output_data+0x60/0x66
 [<c026195b>] atapi_output_bytes+0x19/0x3f
 [<f890d12f>] idescsi_output_buffers+0x6a/0x8c [ide_scsi]
 [<f890d936>] idescsi_pc_intr+0x1b4/0x204 [ide_scsi]
 [<c02611b3>] ide_intr+0x263/0x396
 [<f890d782>] idescsi_pc_intr+0x0/0x204 [ide_scsi]
 [<c0107b97>] handle_IRQ_event+0x25/0x4f
 [<c01084d1>] do_IRQ+0x11c/0x242
 [<c0301d40>] common_interrupt+0x18/0x20
 [<c010403b>] default_idle+0x23/0x26
 [<c010408c>] cpu_idle+0x1f/0x34
 [<c03a76b4>] start_kernel+0x20f/0x211
Badness in i8042_panic_blink at drivers/input/serio/i8042.c:988
 [<c0234dea>] i8042_panic_blink+0x188/0x1a0
 [<c011f585>] panic+0xf2/0x13d
 [<c01067be>] die+0x224/0x22b
 [<c01195b4>] do_page_fault+0x380/0x4dc
 [<c026152d>] ide_outsw+0x5/0xa
 [<c0144a13>] mempool_free+0x11f/0x122
 [<c01675d2>] bio_put+0x27/0x28
 [<c0119234>] do_page_fault+0x0/0x4dc
 [<c0301d7f>] error_code+0x2f/0x38
 [<c026152d>] ide_outsw+0x5/0xa
 [<c02618fd>] ata_output_data+0x60/0x66
 [<c026195b>] atapi_output_bytes+0x19/0x3f
 [<f890d12f>] idescsi_output_buffers+0x6a/0x8c [ide_scsi]
 [<f890d936>] idescsi_pc_intr+0x1b4/0x204 [ide_scsi]
 [<c02611b3>] ide_intr+0x263/0x396
 [<f890d782>] idescsi_pc_intr+0x0/0x204 [ide_scsi]
 [<c0107b97>] handle_IRQ_event+0x25/0x4f
 [<c01084d1>] do_IRQ+0x11c/0x242
 [<c0301d40>] common_interrupt+0x18/0x20
 [<c010403b>] default_idle+0x23/0x26
 [<c010408c>] cpu_idle+0x1f/0x34
 [<c03a76b4>] start_kernel+0x20f/0x211
I have the same problem on RHEL 4 ES with kernel 2.6.9-5.0.3.EL. I can manipulate the tape with mt, and sometimes get sense errors. Everything was working fine with RHEL 3 ES until I tried an upgrade last weekend.
Created attachment 111642 [details] sysreport
I have the same problem (with RH ES 4.0, kernel 2.6.9-5.0.3) on a PowerEdge 800. The mt commands (erase, eject, ...) and the Seagate tools (TapeRx.lx) work fine, but I get a kernel panic with the tar cvf /dev/st0 /xyz command. With ls /proc/ide I see hda and hdb, but with ls /dev/hd* I see hda (cdrom) and hdb is missing.
Strangely enough, the tar completes without a kernel panic in X Windows (true 3/3 times). If I try on a console, I get a kernel panic every time. We've looked at the scatter-gather list, which appears to be using sane addresses. The only difference between a pass and a fail is that the failing tar uses a page address at c17XXXXX and a passing tar uses page addresses at c16XXXXX. Dunno if it's related to the failure, but the X vs. console thing may wake someone up.
It seems that if the sg pages are in high memory, idescsi_output_buffers() will pass a null pointer to atapi_output_bytes() because the sg pages have not been mapped into a kernel virtual memory address. (DMA is not being used with this drive/controller combo.) Maybe ide-scsi.c needs to make sure the sg pages are all mapped into kernel virtual memory...?
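To make the suspected failure mode concrete, here is a minimal user-space sketch, not kernel code. All names (struct page_model, page_address_model(), kmap_model(), kunmap_model(), copy_out_model()) are hypothetical stand-ins; they only mimic the behaviour described above, namely that a high-memory page has no kernel virtual address until something maps it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page (assumption, not the kernel type):
 * lowmem pages always have an address, highmem pages only while mapped. */
struct page_model {
    int   highmem;      /* 1 if the page lives in high memory */
    char *mapped_at;    /* current mapping of a highmem page, or NULL */
    char  data[16];     /* backing storage for the model */
};

/* Mimics page_address(): NULL for a highmem page that is not mapped. */
static char *page_address_model(struct page_model *pg)
{
    return pg->highmem ? pg->mapped_at : pg->data;
}

/* Mimics a temporary mapping: afterwards the page has an address. */
static char *kmap_model(struct page_model *pg)
{
    if (pg->highmem)
        pg->mapped_at = pg->data;
    return pg->data;
}

/* Drops the temporary mapping again. */
static void kunmap_model(struct page_model *pg)
{
    if (pg->highmem)
        pg->mapped_at = NULL;
}

/* PIO-style copy out of a page. Returns -1 where the real driver would
 * have handed a NULL-based pointer to atapi_output_bytes(). */
static int copy_out_model(char *dst, struct page_model *pg,
                          size_t offset, size_t count, int use_kmap)
{
    char *base = use_kmap ? kmap_model(pg) : page_address_model(pg);

    if (base == NULL)
        return -1;
    memcpy(dst, base + offset, count);
    if (use_kmap)
        kunmap_model(pg);
    return 0;
}
```

In this model a lowmem page copies fine either way, while a highmem page only copies when use_kmap is set, which would be consistent with the pass/fail pattern reported above if the failing pages were in high memory.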
Like this (I'll attach as a file, too). It seems to be working so far.

--- ide-scsi.c.orig 2005-03-08 10:53:41.000000000 -0500
+++ ide-scsi.c.new 2005-03-08 10:56:54.689852664 -0500
@@ -169,8 +169,9 @@ static void idescsi_output_buffers (ide_
 			return;
 		}
 		count = min(pc->sg->length - pc->b_count, bcount);
-		buf = page_address(pc->sg->page) + pc->sg->offset;
+		buf = kmap_atomic(pc->sg->page, KM_USER0) + pc->sg->offset;
 		atapi_output_bytes (drive, buf + pc->b_count, count);
+		kunmap_atomic(buf - pc->sg->offset, KM_USER0);
 		bcount -= count; pc->b_count += count;
 		if (pc->b_count == pc->sg->length) {
 			pc->sg++;
@@ -457,7 +458,6 @@ static ide_startstop_t idescsi_pc_intr (
 	atapi_feature_t feature;
 	unsigned int temp;
 
-
 #if IDESCSI_DEBUG_LOG
 	printk (KERN_INFO "ide-scsi: Reached idescsi_pc_intr interrupt handler\n");
 #endif /* IDESCSI_DEBUG_LOG */
Created attachment 111784 [details] patch to make ide-scsi map sg pages to kernel space
Created attachment 111788 [details] patch to map sg in both input and output functions This patch fixes both idescsi_output_buffers and idescsi_input_buffers. We could get the NULL pointer dereference in either routine. We'll submit this patch for 2.6.11.
Created attachment 111821 [details] ide-scsi patch to map sg pages to kernel virtual space I forgot that IDE devices can have interrupts re-enabled before their handlers are called... had to modify the patch to mask local interrupts while page is "kmap_atomic"ed.
Dell will DKMS this fix in the IDE SCSI module in the U1 timeframe; it's on the MUSTFIX list for U2.
The *only* response I could get on the linux-scsi mailing list was from Jens Axboe, who thought that the patch above would work, but preferred to make ide-scsi use bounce buffers for any high memory, with a patch similar to the one at the end of this message (it's for 2.6.11). I'm still partial to the kmap_atomic() patch above, personally. Obviously we're using the kmap_atomic() patch above for the DKMS driver, since the patch below requires changes to files other than ide-scsi.c. I don't know why the SCSI maintainer won't comment on these patches...

diff -purN a/drivers/scsi/ide-scsi.c b/drivers/scsi/ide-scsi.c
--- a/drivers/scsi/ide-scsi.c 2005-03-25 22:28:23.000000000 -0500
+++ b/drivers/scsi/ide-scsi.c 2005-03-31 15:17:38.000000000 -0500
@@ -1048,6 +1048,7 @@ static int idescsi_attach(ide_drive_t *d
 		return 1;
 
 	host->max_id = 1;
+	host->bounce_high = 1;
 
 #if IDESCSI_DEBUG_LOG
 	if (drive->id->last_lun)
diff -purN a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c 2005-03-25 22:28:21.000000000 -0500
+++ b/drivers/scsi/scsi_lib.c 2005-03-31 15:21:35.000000000 -0500
@@ -1344,6 +1344,9 @@ u64 scsi_calculate_bounce_limit(struct S
 	if (host_dev && host_dev->dma_mask)
 		bounce_limit = *host_dev->dma_mask;
 
+	if (shost->bounce_high && (bounce_limit > BLK_BOUNCE_HIGH))
+		return BLK_BOUNCE_HIGH;
+
 	return bounce_limit;
 }
 EXPORT_SYMBOL(scsi_calculate_bounce_limit);
diff -purN a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
--- a/include/scsi/scsi_host.h 2005-03-25 22:28:16.000000000 -0500
+++ b/include/scsi/scsi_host.h 2005-03-31 15:16:54.000000000 -0500
@@ -502,6 +502,12 @@ struct Scsi_Host {
 	unsigned reverse_ordering:1;
 
 	/*
+	 * Host needs all high memory bounced (useful for ide-scsi,
+	 * so it doesn't have to kmap when doing PIO)
+	 */
+	unsigned bounce_high:1;
+
+	/*
 	 * Host has rejected a command because it was busy.
 	 */
 	unsigned int host_blocked;
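For readers comparing the two approaches, the effect of the bounce_high flag can be sketched in user space. This is only a model of the clamp added to scsi_calculate_bounce_limit(), not the kernel function; bounce_limit_model(), needs_bounce_model() and LOWMEM_LIMIT_MODEL are hypothetical, and the 896 MiB figure is an assumed lowmem boundary chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed lowmem boundary for the model (illustrative value only). */
#define LOWMEM_LIMIT_MODEL ((uint64_t)896 << 20)

/* Models only the clamp from the patch: a host that sets bounce_high
 * never gets a bounce limit above the lowmem boundary. */
static uint64_t bounce_limit_model(uint64_t dma_mask, int bounce_high)
{
    uint64_t limit = dma_mask;

    if (bounce_high && limit > LOWMEM_LIMIT_MODEL)
        return LOWMEM_LIMIT_MODEL;
    return limit;
}

/* A page above the limit would be copied to a low bounce page before
 * the driver ever sees it. */
static int needs_bounce_model(uint64_t page_phys, uint64_t limit)
{
    return page_phys > limit;
}
```

With the flag set, a buffer at 1 GiB is bounced into low memory, so page_address() would always return a usable pointer in the PIO path, at the cost of an extra copy.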
The kmap fix looks like the better one to me. As for the lack of comment: the old IDE layer is scheduled for termination, and it's Bartlomiej, rather than the SCSI folks, who maintains that file.
Thanks, I wasn't aware of that! I'll try sending it out to the ide list.
OK, much better... Bart accepted the kmap_atomic() patch, with "#ifdef CONFIG_HIGHMEM"s wrapped around the additions. Here's the actual patch he took (it's against 2.6.11.6). If it would be helpful, I could apply this patch to RHEL4 and make a new patch file that's specifically for RHEL4.

diff -purN a/drivers/scsi/ide-scsi.c b2/drivers/scsi/ide-scsi.c
--- a/drivers/scsi/ide-scsi.c 2005-03-25 22:28:23.000000000 -0500
+++ b2/drivers/scsi/ide-scsi.c 2005-04-05 15:56:48.000000000 -0400
@@ -143,6 +143,9 @@ static void idescsi_input_buffers (ide_d
 {
 	int count;
 	char *buf;
+#ifdef CONFIG_HIGHMEM
+	unsigned long flags;
+#endif
 
 	while (bcount) {
 		if (pc->sg - (struct scatterlist *) pc->scsi_cmd->request_buffer > pc->scsi_cmd->use_sg) {
@@ -151,8 +154,15 @@ static void idescsi_input_buffers (ide_d
 			return;
 		}
 		count = min(pc->sg->length - pc->b_count, bcount);
-		buf = page_address(pc->sg->page) + pc->sg->offset;
+#ifdef CONFIG_HIGHMEM
+		local_irq_save(flags);
+#endif
+		buf = kmap_atomic(pc->sg->page, KM_USER0) + pc->sg->offset;
 		drive->hwif->atapi_input_bytes(drive, buf + pc->b_count, count);
+		kunmap_atomic(buf - pc->sg->offset, KM_USER0);
+#ifdef CONFIG_HIGHMEM
+		local_irq_restore(flags);
+#endif
 		bcount -= count; pc->b_count += count;
 		if (pc->b_count == pc->sg->length) {
 			pc->sg++;
@@ -165,6 +175,9 @@ static void idescsi_output_buffers (ide_
 {
 	int count;
 	char *buf;
+#ifdef CONFIG_HIGHMEM
+	unsigned long flags;
+#endif
 
 	while (bcount) {
 		if (pc->sg - (struct scatterlist *) pc->scsi_cmd->request_buffer > pc->scsi_cmd->use_sg) {
@@ -173,8 +186,15 @@ static void idescsi_output_buffers (ide_
 			return;
 		}
 		count = min(pc->sg->length - pc->b_count, bcount);
-		buf = page_address(pc->sg->page) + pc->sg->offset;
+#ifdef CONFIG_HIGHMEM
+		local_irq_save(flags);
+#endif
+		buf = kmap_atomic(pc->sg->page, KM_USER0) + pc->sg->offset;
 		drive->hwif->atapi_output_bytes(drive, buf + pc->b_count, count);
+		kunmap_atomic(buf - pc->sg->offset, KM_USER0);
+#ifdef CONFIG_HIGHMEM
+		local_irq_restore(flags);
+#endif
 		bcount -= count; pc->b_count += count;
 		if (pc->b_count == pc->sg->length) {
 			pc->sg++;
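The loop that this patch touches walks the scatter-gather list one segment at a time, which is why the map and unmap calls sit inside it. A rough user-space sketch of that walk follows. It is not the driver code: struct sg_model and sg_input_model() are hypothetical, a plain pointer stands in for the mapped page, and the whole transfer happens in one call, whereas the driver keeps pc->sg and pc->b_count across interrupts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct sg_model {
    char  *base;     /* stands in for the address kmap_atomic() returns */
    size_t offset;   /* start of the segment within the page */
    size_t length;   /* bytes available in this segment */
};

/* Copies bcount bytes from src across the segments. Returns the number of
 * bytes that did not fit (0 when the whole transfer was placed). */
static size_t sg_input_model(struct sg_model *sg, size_t nsegs,
                             const char *src, size_t bcount)
{
    size_t seg = 0;      /* plays the role of pc->sg */
    size_t b_count = 0;  /* plays the role of pc->b_count */

    while (bcount) {
        if (seg >= nsegs)
            return bcount;               /* ran out of segments */

        size_t room = sg[seg].length - b_count;
        size_t count = room < bcount ? room : bcount;
        char *buf = sg[seg].base + sg[seg].offset;

        /* In the driver, the map / PIO transfer / unmap happens here. */
        memcpy(buf + b_count, src, count);

        src += count;
        bcount -= count;
        b_count += count;
        if (b_count == sg[seg].length) { /* segment full: move on */
            seg++;
            b_count = 0;
        }
    }
    return 0;
}
```

Because each segment may live in a different page, the mapping has to be taken and dropped per iteration; in the patch, kunmap_atomic(buf - pc->sg->offset, ...) recovers the page base from the offset pointer for the same reason.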
Thanks for the offer (to rediff against RHEL4), but I think what's needed is clear enough.
When will this fix be merged into RHEL4?
see comment #13
We discovered with additional testing that the patch fails, because the kmap_atomic() window KM_USER0 is used in several places in the kernel without IRQs masked (see include/linux/highmem.h, for example). If an IDE IRQ arrives while KM_USER0 is already mapped, the interrupt handler remaps the same window out from under the interrupted code. The patch works fine using KM_IRQ0, which is apparently only used when IRQs are masked.
...but kmaps are per-processor, so if IRQs are locally disabled, an IRQ on another CPU wouldn't use this kmap. Is it possible that there is a schedule while the kmap is held?
IRQs aren't locally disabled while the kmap is held--that's the problem. IRQs are locally disabled when the ide-scsi patch (above) holds the kmap, but they aren't at other places in the kernel (see the functions in include/linux/highmem.h, for example). And, since the ide-scsi patch does the kmap within the interrupt handler, we've got a problem if an ide-scsi IRQ interrupts one of those functions in include/linux/highmem.h.
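The collision can be illustrated with a small user-space model; this is a sketch under stated assumptions, not kernel code. struct cpu_model, the SLOT_* names and every *_model function are hypothetical. The model assumes what the comments above describe: each CPU has one fixed window per slot, mapping a page into a slot replaces whatever was there, and unmapping leaves the window empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU windows, loosely modelled on KM_USER0 / KM_IRQ0. */
enum slot_model { SLOT_USER0, SLOT_IRQ0, SLOT_COUNT };

struct cpu_model {
    char *slot[SLOT_COUNT];  /* page currently visible through each window */
};

static void kmap_atomic_model(struct cpu_model *cpu, char *page,
                              enum slot_model s)
{
    cpu->slot[s] = page;     /* silently replaces any existing mapping */
}

static void kunmap_atomic_model(struct cpu_model *cpu, enum slot_model s)
{
    cpu->slot[s] = NULL;
}

/* Store through a window; -1 models a fault on an empty window. */
static int window_store_model(struct cpu_model *cpu, enum slot_model s,
                              size_t off, char c)
{
    if (cpu->slot[s] == NULL)
        return -1;
    cpu->slot[s][off] = c;
    return 0;
}

/* Models the ide-scsi interrupt handler doing PIO through irq_slot. */
static void irq_handler_model(struct cpu_model *cpu, char *sg_page,
                              enum slot_model irq_slot)
{
    kmap_atomic_model(cpu, sg_page, irq_slot);
    (void)window_store_model(cpu, irq_slot, 0, 'I');
    kunmap_atomic_model(cpu, irq_slot);
}

/* Process-context code holding SLOT_USER0 with IRQs enabled; the
 * "interrupt" fires between its two stores. Returns the result of the
 * store made after the interrupt. */
static int interrupted_copy_model(struct cpu_model *cpu, char *user_page,
                                  char *sg_page, enum slot_model irq_slot)
{
    int rc;

    kmap_atomic_model(cpu, user_page, SLOT_USER0);
    (void)window_store_model(cpu, SLOT_USER0, 0, 'A');
    irq_handler_model(cpu, sg_page, irq_slot);   /* IRQ arrives here */
    rc = window_store_model(cpu, SLOT_USER0, 1, 'B');
    kunmap_atomic_model(cpu, SLOT_USER0);
    return rc;
}
```

When the handler uses SLOT_USER0, the second store of the interrupted code fails because its window was torn down underneath it; when the handler uses SLOT_IRQ0, both contexts finish cleanly. Masking IRQs inside the handler does not help in the first case, since the code being interrupted is the one that left IRQs enabled.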
Is there any solution to this problem?
Created attachment 116617 [details] Patch to implement kmap functionality in ide-scsi module Slightly cleaner version of the patch as accepted upstream. This patch has been submitted for internal review and possible inclusion in RHEL4 U2.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-514.html