Description of problem:
repeatable panic trying to start gimp

Linux up 2.6.9-16.ELsmp #1 SMP Mon Aug 15 20:38:46 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux

No vmcore available from the panic, got the oops via netdump:

[root@amazon-2000 172.16.45.70-2005-08-23-02:52]# cat log
Unable to handle kernel NULL pointer dereference at 0000000000000003 RIP:
<ffffffff801648cb>{__bounce_end_io_read+69}
PML4 122b39067 PGD 122b2e067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: netconsole netdump nfsd exportfs lockd md5 ipv6 parport_pc lp parport i2c_dev i2c_core sunrpc dm_mod button battery ac ohci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc<ffffffff801648cb>{__bounce_end_io_read+69}
RSP: 0018:ffffffff8044a820 EFLAGS: 00010202
RAX: 00000101253a7000 RBX: 000001012285e300 RCX: 0000000000017000
RDX: 0000000000000000 RSI: 000001012fc77400 RDI: 000001012285e380
RBP: 0000000000000000 R08: 000001012603fc00 R09: 0000010004f84938
R10: 00000101253a7c00 R11: 000001012285e380 R12: 0000000000000000
R13: 000001012fc77400 R14: 0000000000000000 R15: 0000000000017000
FS: 0000000000000000(0000) GS:ffffffff804d3300(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000003 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff804d6000, task ffffffff803ca980)
Stack: 0000000000017000 000001012285e380 0000000000000000 0000010126246eb8
       ffffffff801649a2 000000000001e000 ffffffff80249aae
<IRQ> <ffffffff801649a2>{bounce_end_io_read_isa+25}
<ffffffff80249aae>{__end_that_request_first+238}
<ffffffffa0006d7d>{:scsi_mod:scsi_end_request+40}
<ffffffffa0007092>{:scsi_mod:scsi_io_completion+497}
<ffffffffa0002d21>{:scsi_mod:scsi_softirq+213}
<ffffffff8013b724>{__do_softirq+88}
<ffffffff8013b7cd>{do_softirq+49}
<ffffffff80112f77>{do_IRQ+328}
<ffffffff8011061b>{ret_from_intr+0}
<EOI> <ffffffff8010e6cc>{mwait_idle+86}
<ffffffff8010e65c>{cpu_idle+26}
<ffffffff804d967b>{start_kernel+470}
<ffffffff804d91d5>{_sinittext+469}

Code: 48 0f b6 42 03 49 b8 b7 6d db b6 6d db b6 6d 48 bf 00 00 00
RIP <ffffffff801648cb>{__bounce_end_io_read+69} RSP <ffffffff8044a820>
CR2: 0000000000000003

Given that this is x86_64 with no highmem - I'm not even sure why it would be in this function.
The bounce code can get called even on a 64-bit system if the device can't DMA to the requested address.
OK, so doing a bounce isn't necessarily odd... except that on x86_64 we can only bounce through the isa_dma_pool, because without highmem we don't initialize any other bounce pool.

Not directly related to the panic is a side issue: on this brand new hardware, why am I limited to the ISA range for DMA? In part, it is because we don't create a lowmem DMA pool when there is no highmem region on the system. Another reason would seem to be that the SCSI (SATA) device is not being detected as able to DMA to the entire 4 GB of RAM on the system. That could be bad support for the device, a bad BIOS, or bad detection code.
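The reasoning in the comments above can be sketched as a small userspace model. This is only an illustration of the decision as described in this report, not the kernel's code: every name here (system_model, device_model, pick_pool, the POOL_* values) is hypothetical, and the "pick the general pool whenever highmem exists" rule is a simplifying assumption.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum bounce_pool { POOL_NONE, POOL_ISA, POOL_GENERAL };

/* Hypothetical description of the machine. */
struct system_model {
    bool has_highmem;    /* false on x86_64 */
    bool isa_pool_ready; /* the ISA DMA bounce pool was initialized */
};

/* Hypothetical description of a block device (bare or stacked). */
struct device_model {
    uint64_t dma_limit;  /* highest physical address the device can reach */
};

/* A buffer must be bounced when it lies above the device's DMA limit. */
static bool needs_bounce(const struct device_model *dev, uint64_t buf_addr)
{
    return buf_addr > dev->dma_limit;
}

/* Assumed rule: with no highmem, the ISA pool is the only pool that exists. */
static enum bounce_pool pick_pool(const struct system_model *sys,
                                  const struct device_model *dev,
                                  uint64_t buf_addr)
{
    if (!needs_bounce(dev, buf_addr))
        return POOL_NONE;
    if (sys->has_highmem)
        return POOL_GENERAL;
    /* Stand-in for a kernel-style sanity check on the only remaining pool. */
    assert(sys->isa_pool_ready);
    return POOL_ISA;
}
```

In this model, a device whose limit covers all of memory never bounces, while a device (or stacked device) that advertises a low limit on a machine without highmem always ends up in the ISA pool, which matches the bounce_end_io_read_isa frame in the trace.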
As I continue to dig... Reproducing the panic simply takes cat'ing a file on the LVM array set up on this system, while access to the disk that is not part of the LVM array doesn't seem to have any issues. I haven't tracked down how yet, but it appears that IO from the LVM array is not being flagged as capable of DMA to all of memory, even though the underlying device is. With no highmem zone, that DMA must then go through the ISA DMA region.
So... I took LVM apart and used the bare drives. Individually, they all work fine. So something in the LVM path requires bounce IO that the bare devices do not, and it is that bounce IO that runs into problems.
Created attachment 122807
patch to highmem.c

Patch from upstream via Dell in IT 85468
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0132.html