Bug 169115

Summary: Kernel panic when more than 4GB RAM and no IOMMU
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: medium
Priority: high
Reporter: Milan Kerslager <milan.kerslager>
Assignee: Jim Paradis <jparadis>
QA Contact: Brian Brock <bbrock>
CC: berthiaume_wayne, bimod.n, brett.morrow, hansecke, jbaron, netllama, ohegarty, pan_haifeng, peterm, redhat
Doc Type: Bug Fix
Last Closed: 2006-09-25 19:21:05 UTC
Attachments:
Boot to crash
dmesg output

Description Milan Kerslager 2005-09-23 08:05:55 UTC
I was not able to boot RHEL4U1 (2.6.9-11.EL) kernel on ASUS A8N-E with AMD
Athlon 64 X2 Dual Core Processor 4400+ and 4GB of RAM because the kernel stopped
with this message after kernel boot:

Kernel panic: PCI-DMA: high address but no IOMMU

I was able to boot only with 2GB of RAM (dual channel, so one half), and the 32bit
kernel worked. The kernel panic was partially fixed by updating the BIOS to the
latest beta (rev 1009 from the ASUS site). Now when booting the 2.6.9-11.ELsmp x86_64
kernel I get this message:

PCI-DMA: More than 4GB of RAM and no IOMMU
PCI-DMA: 32bit PCI IO may malfunction.<6> PCI DMA: disabling IOMMU

The kernel seems to work then. I'm able to completely avoid this message by
passing mem=4094M to the kernel via Grub.

What does this mean? Is it a bad motherboard or BIOS, or does the kernel have a
shortcoming in this area? Is the IOMMU present at all? Is there a performance hit
when using more than 4GB of RAM?

Comment 1 Milan Kerslager 2005-09-23 08:35:16 UTC
Sorry... The kernel still panics right after HW initialization (module loading),
even with the updated BIOS (and with no mem= option passed to the kernel), with
this message:

Kernel panic - not syncing: PCI-DMA: high address but no IOMMU.

So with the old BIOS (and no mem=) the kernel panics just before "audit(...):
initialized". With the latest BIOS the kernel boots but panics when modules are
loaded (probably the forcedeth module for the NIC).

With latest BIOS and mem=4094M the system boots and seems to be able to run.

Comment 2 Milan Kerslager 2005-09-28 05:12:20 UTC
I tried the latest kernel from the RHEL4 Beta channel
(kernel-smp-2.6.9-17.EL.x86_64.rpm) with no luck. It still panics without the
mem=4094M parameter.

The system is still unable to handle more than 4GB RAM.

The system also has clock instability problems, see bug #168255.

Comment 3 Brett Morrow 2005-09-28 14:27:46 UTC
I have the same problem on a Supermicro H8DCE motherboard.  There seems to be a
workaround in the latest BIOS. It has an OS option of "linux kernel 2.6.9". No
other setting seems to help.  With this set, I was able to load and run
the OS as long as I do not choose the SMP kernel (which just panics).  So is
this just a 2.6.9 codebase problem?


Comment 4 Brett Morrow 2005-09-29 17:45:26 UTC
With the 2.6.9-22ELsmp I am able to run just fine.



Comment 5 Brett Morrow 2005-09-29 21:27:42 UTC
With the 2.6.9-22ELsmp I am able to run just fine.

Comment 6 Brett Morrow 2005-09-29 21:29:36 UTC
Sorry for the repeat, got a "Mid Air Collision" message from Bugzilla and told it
to submit anyway.  Here is where I got the kernel that worked for me.

http://people.redhat.com/~jbaron/rhel4/

Comment 7 Milan Kerslager 2005-09-30 11:44:26 UTC
Kernel .22 did not help here (ASUS A8N-E, Athlon 64 X2 Dual Core 4400+ and 4GB
of RAM). I still have to pass the mem=4094M option to the kernel. Without this I get:

PCI-DMA: More than 4GB of RAM and no IOMMU
PCI-DMA: 32bit PCI IO may malfunction.<6>PCI-DMA: Disabling IOMMU
...
Kernel panic - not syncing:PCI-DMA: high address but no IOMMU

Comment 8 Jim Paradis 2005-09-30 14:43:07 UTC
Can you send us a serial console dump for this issue?  I'd like to see a little
more context.


Comment 9 Brett Morrow 2005-10-03 15:42:51 UTC
Ok, I double checked my BIOS, and even with the new kernel, if the BIOS is not
set to "Boot OS Linux Kernel 2.6.9" I still get a kernel panic:

Initializing hardware...  storage network audio done[  OK  ]
Configuring kernel parameters:  [  OK  ]
Kernel panic - not syncing: PCI-DMA: high address but no IOMMU.

Comment 10 Brett Morrow 2005-10-06 02:26:35 UTC
I see update 2 with kernel version .22 was released.  Has anyone tried it to see
if a fix is in the released kernel?


Comment 11 Brett Morrow 2005-10-06 23:31:44 UTC
Got to try this today.  On the system with the option to specify the "Linux
Kernel OS 2.6.9", if I have this option set, it boots and loads.  If I have the
option set to other OS it dies.  

This is with the WS4U2 released yesterday.


Comment 12 Brett Morrow 2005-10-07 15:18:07 UTC
Here is what I get on the console (this is with WS4 U2, released):

Starting udev:  [  OK  ]
Initializing hardware...  storage network audio done[  OK  ]
Configuring kernel parameters:  [  OK  ]
Kernel panic - not syncing: PCI-DMA: high address but no IOMMU.

Comment 13 Jim Paradis 2005-10-07 16:54:19 UTC
Can you hook up a serial line to the system and capture all of the console
messages from boot to crash?  You would have to add flags such as
"console=ttyS0,115200 console=tty0" to the boot command line.  We can't help
until we know more about your system.
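
As a rough illustration of the request above, a GRUB legacy entry with the serial console flags might look like the fragment below. The file location (/boot/grub/grub.conf), title, root device, kernel and initrd file names, and root= value are assumptions for illustration only; the two console= options are the ones quoted in this comment.

```
# /boot/grub/grub.conf (hypothetical entry; adjust names to the installed kernel)
title Red Hat Enterprise Linux (2.6.9-22.ELsmp, serial console)
        root (hd0,0)
        # console=ttyS0,115200 sends kernel messages to the first serial port
        # console=tty0 keeps them on the local display as well
        kernel /vmlinuz-2.6.9-22.ELsmp ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 console=tty0
        initrd /initrd-2.6.9-22.ELsmp.img
```

If the existing kernel line carries "quiet", removing it for the capture should give fuller output, since that option suppresses most boot messages.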


Comment 14 Brett Morrow 2005-10-07 18:56:34 UTC
Created attachment 119721
Boot to crash

Comment 15 Brett Morrow 2005-10-20 02:15:39 UTC
Any word on this?


Comment 16 Brett Morrow 2005-10-27 17:39:43 UTC
I submitted the information you asked for almost 3 weeks ago.  Any word on a fix
or help on this?  I saw a new kernel was released today that fixed other
problems I have had, but no mention of this.

Thanks

-Brett


Comment 17 Milan Kerslager 2005-10-27 21:56:17 UTC
Until U3 there will be no fixes other than security fixes (or really serious bugfixes
for big RH customers). But yes - testing kernels are welcome here too...

Comment 18 Brett Morrow 2005-10-27 22:35:04 UTC
This problem dates back to before U2 was released.  Why no fix then?  Why no
progress on a fix before U3?  


Comment 19 Jim Paradis 2005-11-10 21:18:58 UTC
It turns out to be a PCI config space access issue.  A temporary workaround is
to specify the boot parameter "pci=nommconf".  I have a patch that will
auto-detect this platform and specify the correct type of PCI config space access.


Comment 21 Milan Kerslager 2005-11-24 21:36:19 UTC
I can confirm that the workaround from comment #19 works here. Is it possible
to test this patch (I'm able to compile my own kernel)?

Comment 22 Milan Kerslager 2005-11-24 21:57:49 UTC
Oops. The machine boots but hangs as soon as higher load comes :-( Reproduced
three times there.

Comment 23 Jim Paradis 2005-11-28 18:52:27 UTC
What kind of load are you running?  We have a similar configuration here at Red
Hat (which I used to come up with the workaround in Comment #19), and I've not
heard any reports of mysterious hangs...

Comment 24 Milan Kerslager 2005-11-29 11:20:59 UTC
The hang was seen on an ASUS A8N-E with an AMD Athlon 64 X2 Dual Core Processor 4400+
and 4GB of RAM. I have no direct access to this machine, so if you need more
info I have to talk with the responsible person. But I would like to provide as
much information as possible.

Comment 25 josip 2005-11-29 19:59:50 UTC
See http://lkml.org/lkml/2005/11/6/54 for some ideas -- try booting with
"iommu=soft swiotlb=65536".  This bug may be a duplicate of bug #166437 which
was fixed in 2.6.13-based kernels but reappeared in 2.6.14-based kernels.

Comment 26 josip 2005-11-30 14:42:27 UTC
While "iommu=soft swiotlb=65536" works, it may be better to use "pci=nommconf"
as recommended by Andi Kleen at http://bugzilla.kernel.org/show_bug.cgi?id=5343
-- the problem he reports is that the MCFG table provided by ACPI BIOS is broken
and needs a fix.  He is developing a workaround.

The MCFG table describes the memory mapped PCI configuration space, which is
required for the MMCONF form of access to devices on the PCI-Express bus
(otherwise, one must address them through BIOS or directly).

Anyway, using the boot line option "pci=nommconf" works for me, and I'll use
this until Andi's workaround arrives.
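
For reference, the two workarounds discussed in this thread are ordinary kernel command-line options. A sketch of how they might be placed in a GRUB legacy entry follows; the title, root device, file names, and root= value are hypothetical, and only the options themselves come from the comments above.

```
# /boot/grub/grub.conf (hypothetical entry)
title Red Hat Enterprise Linux (2.6.9-22.ELsmp, MMCONFIG disabled)
        root (hd0,0)
        # Workaround from comment #19: do not use MMCONFIG (MCFG-based) PCI config access
        kernel /vmlinuz-2.6.9-22.ELsmp ro root=/dev/VolGroup00/LogVol00 pci=nommconf
        # Alternative from comment #25, replacing pci=nommconf on the line above:
        #   iommu=soft swiotlb=65536
        initrd /initrd-2.6.9-22.ELsmp.img
```

Disabling the "ACPI MCFG Table" option in the BIOS, where one exists, is reported later in this bug as having a similar effect.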

Comment 27 Jim Paradis 2005-12-01 18:40:19 UTC
Hmmm... Comment #18 and Comment #19 seem to be lifted straight from Bug 166437.
 Can you tell me if either of the workarounds ("iommu=soft swiotlb=65536" or
"pci=nommconf") works for you and prevents the hang from happening?


Comment 28 BJ Wolbers 2006-01-04 13:49:27 UTC
(In reply to comment #27)
> Hmmm... Comment #18 and Comment #19 seem to be lifted straight from Bug 166437.
>  Can you tell me if either of the workarounds ("iommu=soft swiotlb=65536" or
> "pci=nommconf") works for you and prevents the hang from happening?
> 

I am facing the same problem. I'm running an Athlon X2 3800 on a Biostar NF4
mobo with 4GB RAM and an Adaptec 29320A Ultra320 SCSI adapter, and I've been facing
instability for 3 months now. I started with FC3 and the 2.6.12 kernel, which was
very unstable because of a bug with the SCSI adapter and memory
addressing. 2.6.13 and later solved this problem, but the system still locks up
at random times. My server usually locks up within 1-50 hours; uptime has never
been more than roughly 2 days. I upgraded to FC5 Test 1 last week, but even
the latest 2.6.15 kernel didn't solve this.
I've brought the memory usage down by using "mem=3500M" (I've been using that option
for 2 months now with various kernels) to eliminate the 4GB+ problem, but the
system is still unstable. It always boots, but it just locks up at random. I'm
using "pci=nommconf" on top of "mem=3500M" now, but it crashed after about 8 hours
of uptime. I think this is not working either, so if it crashes one more time
I'll try "iommu=soft swiotlb=65536" and keep you updated. I've posted dmesg below:

Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 mem=3500M pci=nommconf)
Linux version 2.6.14-1.1808_FC5 (bhcompile.redhat.com) (gcc version 4.1.0 20051222 (Red Hat 4.1.0-0.12)) #1 SMP Mon Jan 2 17:11:48 EST 2006
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
 BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000ceef0000 (usable)
 BIOS-e820: 00000000ceef0000 - 00000000ceef3000 (ACPI NVS)
 BIOS-e820: 00000000ceef3000 - 00000000cef00000 (ACPI data)
 BIOS-e820: 00000000df000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
ACPI: RSDP (v000 Nvidia                                ) @ 0x00000000000f7c00
ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x00000000ceef3040
ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x00000000ceef30c0
ACPI: MCFG (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x00000000ceef7ec0
ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x00000000ceef7e00
ACPI: DSDT (v001 NVIDIA AWRDACPI 0x00001000 MSFT 0x0100000e) @ 0x0000000000000000
Scanning NUMA topology in Northbridge 24
Number of nodes 1
Node 0 MemBase 0000000000000000 Limit 00000000dac00000
Using 63 for the hash shift.
Using node hash shift of 63
Bootmem setup node 0 0000000000000000-00000000dac00000
On node 0 totalpages: 830407
  DMA zone: 2581 pages, LIFO batch:0
  DMA32 zone: 827826 pages, LIFO batch:31
  Normal zone: 0 pages, LIFO batch:0
  HighMem zone: 0 pages, LIFO batch:0
Nvidia board detected. Ignoring ACPI timer override.
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:3 APIC version 16
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 15:3 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Setting APIC routing to physical flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at d0000000 (gap: cef00000:10100000)
Checking aperture...
CPU 0: aperture @ 400000000 size 32 MB
Aperture from northbridge cpu 0 too small (32 MB)
No AGP bridge found
SMP: Allowing 2 CPUs, 0 hotplug CPUs
Built 1 zonelists
Kernel command line: ro root=/dev/VolGroup00/LogVol00 mem=3500M pci=nommconf
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 2000.056 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Memory: 3313364k/3584000k available (2336k kernel code, 76636k reserved, 1641k data, 224k init)
Calibrating delay using timer specific routine.. 4006.56 BogoMIPS (lpj=8013129)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(2) -> Node 0 -> Core 0
mtrr: v2.0 (20020519)
Using local APIC timer interrupts.
Detected 12.500 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4000.26 BogoMIPS (lpj=8000526)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 1(2) -> Node 0 -> Core 1
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff 1 cycles, maxerr 545 cycles)
Brought up 2 CPUs
Disabling vsyscall due to use of PM timer
time.c: Using PM based timekeeping.
testing NMI watchdog ... OK.
checking if image is initramfs... it is
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Subsystem revision 20050902
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:00:05.0
PCI: Transparent bridge - 0000:00:10.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK2] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs 5 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [LNK4] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK5] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK6] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK7] (IRQs *5 7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LNK8] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LUBA] (IRQs *5 7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LUBB] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LMAC] (IRQs 5 *7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LACI] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LAZA] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LPMU] (IRQs 5 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [LMCI] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 5 *7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LUB2] (IRQs 5 7 9 *10 11 14 15)
ACPI: PCI Interrupt Link [LIDE] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSID] (IRQs 5 7 9 *10 11 14 15)
ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0, disabled.
ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
ACPI: PCI Interrupt Link [APC5] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC6] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC7] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC8] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APMU] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [AAZA] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 10 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI: Cannot allocate resource region 1 of device 0000:00:05.0
PCI-DMA: Disabling IOMMU.
pnp: 00:00: ioport range 0x1000-0x107f could not be reserved
pnp: 00:00: ioport range 0x1080-0x10ff has been reserved
pnp: 00:00: ioport range 0x1400-0x147f has been reserved
pnp: 00:00: ioport range 0x1480-0x14ff could not be reserved
pnp: 00:00: ioport range 0x1800-0x187f has been reserved
pnp: 00:00: ioport range 0x1880-0x18ff has been reserved
pnp: 00:00: ioport range 0x2000-0x207f has been reserved
pnp: 00:00: ioport range 0x2080-0x20ff has been reserved
PCI: Bridge: 0000:00:02.0
  IO window: c000-cfff
  MEM window: fde00000-fdefffff
  PREFETCH window: fdd00000-fddfffff
PCI: Bridge: 0000:00:03.0
  IO window: b000-bfff
  MEM window: fdc00000-fdcfffff
  PREFETCH window: fdb00000-fdbfffff
PCI: Bridge: 0000:00:04.0
  IO window: 9000-9fff
  MEM window: fd800000-fd8fffff
  PREFETCH window: fd700000-fd7fffff
PCI: Bridge: 0000:00:10.0
  IO window: a000-afff
  MEM window: fda00000-fdafffff
  PREFETCH window: fd900000-fd9fffff
PCI: Setting latency timer of device 0000:00:02.0 to 64
PCI: Setting latency timer of device 0000:00:03.0 to 64
PCI: Setting latency timer of device 0000:00:04.0 to 64
PCI: Setting latency timer of device 0000:00:10.0 to 64
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
audit: initializing netlink socket (disabled)
audit(1136361157.884:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key A4BEF3EBE9F4C53E
- User ID: Red Hat, Inc. (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
PCI: Setting latency timer of device 0000:00:02.0 to 64
pcie_portdrv_probe->Dev[02fc:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[pcie00]
PCI: Setting latency timer of device 0000:00:03.0 to 64
pcie_portdrv_probe->Dev[02fd:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[pcie00]
PCI: Setting latency timer of device 0000:00:04.0 to 64
pcie_portdrv_probe->Dev[02fb:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[pcie00]
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: Fan [FAN] (on)
ACPI: Thermal Zone [THRM] (40 C)
Real Time Clock Driver v1.12
Linux agpgart interface v0.101 (c) Dave Jones
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing enabled
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE-MCP51: IDE controller at PCI slot 0000:00:0d.0
NFORCE-MCP51: chipset revision 161
NFORCE-MCP51: not 100% native mode: will probe irqs later
NFORCE-MCP51: 0000:00:0d.0 (rev a1) UDMA133 controller
    ide0: BM-DMA at 0xf400-0xf407, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xf408-0xf40f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ide-floppy driver 0.99.newide
usbcore: registered new driver libusual
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
NET: Registered protocol family 2
input: AT Translated Set 2 keyboard as /class/input/input0
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 131072 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 9, 2097152 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
powernow-k8: Found 2 AMD Athlon 64 / Opteron processors (version 1.50.4)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
ACPI wakeup devices: 
SLPB HUB0 XVRA XVRB XVRC USB0 USB2 AZAD MMAC MMCI 
ACPI: (supports S0 S1 S4 S5)
Freeing unused kernel memory: 224k freed
Write protecting the kernel read-only data: 862k
SCSI subsystem initialized
ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
GSI 16 sharing vector 0xD1 and IRQ 16
ACPI: PCI Interrupt 0000:04:06.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 209
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec 29320A Ultra320 SCSI adapter>
        aic7901: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs

input: PS/2 Generic Mouse as /class/input/input1
  Vendor: SEAGATE   Model: ST336754LW        Rev: 0003
  Type:   Direct-Access                      ANSI SCSI revision: 03
 target0:0:0: asynchronous.
scsi0:A:0:0: Tagged Queuing enabled.  Depth 4
 target0:0:0: Beginning Domain Validation
 target0:0:0: wide asynchronous.
 target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 63)
 target0:0:0: Ending Domain Validation
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
libata version 1.20 loaded.
sata_nv 0000:00:0e.0: version 0.8
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 23
GSI 17 sharing vector 0xD9 and IRQ 17
ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [APSI] -> GSI 23 (level, low) -> IRQ 217
PCI: Setting latency timer of device 0000:00:0e.0 to 64
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xE000 irq 217
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xE008 irq 217
ata1: no device found (phy stat 00000000)
scsi1 : sata_nv
ata2: no device found (phy stat 00000000)
scsi2 : sata_nv
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: dm-0: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 3050630
ext3_orphan_cleanup: deleting unreferenced inode 3051701
ext3_orphan_cleanup: deleting unreferenced inode 3051706
ext3_orphan_cleanup: deleting unreferenced inode 320035
ext3_orphan_cleanup: deleting unreferenced inode 321965
ext3_orphan_cleanup: deleting unreferenced inode 326585
ext3_orphan_cleanup: deleting unreferenced inode 321457
ext3_orphan_cleanup: deleting unreferenced inode 316056
ext3_orphan_cleanup: deleting unreferenced inode 317304
ext3_orphan_cleanup: deleting unreferenced inode 316928
ext3_orphan_cleanup: deleting unreferenced inode 2425010
ext3_orphan_cleanup: deleting unreferenced inode 3722501
ext3_orphan_cleanup: deleting unreferenced inode 321980
ext3_orphan_cleanup: deleting unreferenced inode 321979
ext3_orphan_cleanup: deleting unreferenced inode 318119
ext3_orphan_cleanup: deleting unreferenced inode 321583
ext3_orphan_cleanup: deleting unreferenced inode 316018
ext3_orphan_cleanup: deleting unreferenced inode 317347
ext3_orphan_cleanup: deleting unreferenced inode 323463
ext3_orphan_cleanup: deleting unreferenced inode 320110
ext3_orphan_cleanup: deleting unreferenced inode 320954
ext3_orphan_cleanup: deleting unreferenced inode 321365
ext3_orphan_cleanup: deleting unreferenced inode 317336
ext3_orphan_cleanup: deleting unreferenced inode 321964
ext3_orphan_cleanup: deleting unreferenced inode 322585
ext3_orphan_cleanup: deleting unreferenced inode 320943
ext3_orphan_cleanup: deleting unreferenced inode 320057
ext3_orphan_cleanup: deleting unreferenced inode 322293
ext3_orphan_cleanup: deleting unreferenced inode 321975
ext3_orphan_cleanup: deleting unreferenced inode 526214
ext3_orphan_cleanup: deleting unreferenced inode 321744
ext3_orphan_cleanup: deleting unreferenced inode 320053
ext3_orphan_cleanup: deleting unreferenced inode 3915875
ext3_orphan_cleanup: deleting unreferenced inode 321157
ext3_orphan_cleanup: deleting unreferenced inode 321049
ext3_orphan_cleanup: deleting unreferenced inode 320839
ext3_orphan_cleanup: deleting unreferenced inode 320837
ext3_orphan_cleanup: deleting unreferenced inode 3915830
ext3_orphan_cleanup: deleting unreferenced inode 320810
ext3_orphan_cleanup: deleting unreferenced inode 320555
ext3_orphan_cleanup: deleting unreferenced inode 320705
ext3_orphan_cleanup: deleting unreferenced inode 320692
ext3_orphan_cleanup: deleting unreferenced inode 327322
ext3_orphan_cleanup: deleting unreferenced inode 3915862
ext3_orphan_cleanup: deleting unreferenced inode 213035
ext3_orphan_cleanup: deleting unreferenced inode 213029
ext3_orphan_cleanup: deleting unreferenced inode 320166
ext3_orphan_cleanup: deleting unreferenced inode 3915837
ext3_orphan_cleanup: deleting unreferenced inode 3915832
ext3_orphan_cleanup: deleting unreferenced inode 3915817
ext3_orphan_cleanup: deleting unreferenced inode 320165
ext3_orphan_cleanup: deleting unreferenced inode 3916171
ext3_orphan_cleanup: deleting unreferenced inode 315892
ext3_orphan_cleanup: deleting unreferenced inode 320077
ext3_orphan_cleanup: deleting unreferenced inode 3915816
ext3_orphan_cleanup: deleting unreferenced inode 3915812
ext3_orphan_cleanup: deleting unreferenced inode 3915809
ext3_orphan_cleanup: deleting unreferenced inode 319890
ext3_orphan_cleanup: deleting unreferenced inode 3915808
ext3_orphan_cleanup: deleting unreferenced inode 3916441
ext3_orphan_cleanup: deleting unreferenced inode 319889
ext3_orphan_cleanup: deleting unreferenced inode 319885
ext3_orphan_cleanup: deleting unreferenced inode 318571
ext3_orphan_cleanup: deleting unreferenced inode 318565
ext3_orphan_cleanup: deleting unreferenced inode 318562
ext3_orphan_cleanup: deleting unreferenced inode 318557
ext3_orphan_cleanup: deleting unreferenced inode 318555
ext3_orphan_cleanup: deleting unreferenced inode 318554
ext3_orphan_cleanup: deleting unreferenced inode 3915806
ext3_orphan_cleanup: deleting unreferenced inode 3915805
ext3_orphan_cleanup: deleting unreferenced inode 318433
ext3_orphan_cleanup: deleting unreferenced inode 318431
ext3_orphan_cleanup: deleting unreferenced inode 3915804
ext3_orphan_cleanup: deleting unreferenced inode 318295
ext3_orphan_cleanup: deleting unreferenced inode 318187
ext3_orphan_cleanup: deleting unreferenced inode 3916438
ext3_orphan_cleanup: deleting unreferenced inode 318183
ext3_orphan_cleanup: deleting unreferenced inode 3916439
ext3_orphan_cleanup: deleting unreferenced inode 3916177
ext3_orphan_cleanup: deleting unreferenced inode 3916437
ext3_orphan_cleanup: deleting unreferenced inode 317570
ext3_orphan_cleanup: deleting unreferenced inode 318553
ext3_orphan_cleanup: deleting unreferenced inode 317971
ext3_orphan_cleanup: deleting unreferenced inode 3915798
ext3_orphan_cleanup: deleting unreferenced inode 3916175
ext3_orphan_cleanup: deleting unreferenced inode 320535
ext3_orphan_cleanup: deleting unreferenced inode 3916173
ext3_orphan_cleanup: deleting unreferenced inode 320327
ext3_orphan_cleanup: deleting unreferenced inode 320461
ext3_orphan_cleanup: deleting unreferenced inode 3915995
ext3_orphan_cleanup: deleting unreferenced inode 3915920
ext3_orphan_cleanup: deleting unreferenced inode 3047619
ext3_orphan_cleanup: deleting unreferenced inode 3047598
ext3_orphan_cleanup: deleting unreferenced inode 3047593
ext3_orphan_cleanup: deleting unreferenced inode 3047503
EXT3-fs: dm-0: 95 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
SELinux:  Disabled at runtime.
SELinux:  Unregistering netfilter hooks
forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.48.
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 22
GSI 18 sharing vector 0xE1 and IRQ 18
ACPI: PCI Interrupt 0000:00:14.0[A] -> Link [APCH] -> GSI 22 (level, low) -> IRQ 225
PCI: Setting latency timer of device 0000:00:14.0 to 64
eth0: forcedeth.c: subsystem: 01565:2501 bound to 0000:00:14.0
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Sleep Button (CM) [SLPB]
ibm_acpi: ec object not found
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3 FS on dm-0, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 2031608k swap on /dev/VolGroup00/LogVol01.  Priority:-1 extents:1 across:2031608k
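
The BIOS-e820 map at the top of this dmesg shows part of the 4GB of RAM remapped above the 4GB address boundary, which is the memory that 32-bit PCI DMA cannot reach without an IOMMU or bounce buffers. A minimal Python sketch, assuming the e820 line format shown in this log, that totals the usable RAM at or above 4GB (the function name and the sample excerpt are illustrative, not part of any kernel tool):

```python
import re

# Matches lines such as:
#  BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
E820_LINE = re.compile(
    r"BIOS-e820:\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\(([^)]+)\)"
)
FOUR_GB = 1 << 32


def usable_above_4g(dmesg_text):
    """Return the number of bytes of 'usable' RAM mapped at or above 4 GB."""
    total = 0
    for match in E820_LINE.finditer(dmesg_text):
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        if match.group(3) != "usable":
            continue
        low = max(start, FOUR_GB)  # clip the range to the part above 4 GB
        if end > low:
            total += end - low
    return total


# e820 lines copied from the dmesg in this comment (excerpt).
SAMPLE = """\
 BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
 BIOS-e820: 0000000000100000 - 00000000ceef0000 (usable)
 BIOS-e820: 00000000df000000 - 00000000f0000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
"""

if __name__ == "__main__":
    mb = usable_above_4g(SAMPLE) // (1 << 20)
    print(f"{mb} MB of usable RAM above the 4 GB boundary")
```

In this particular boot the mem=3500M option keeps the kernel from using that region at all (node 0 ends at 0xdac00000), so the lockups reported in this comment occur even without memory above 4GB in use.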

Comment 29 Linda Wang 2006-01-18 04:39:47 UTC
For RHEL4 users, can you please provide feedback as requested by
the developer in comment #27?

For Fedora Core users, can you please file a separate bug, so that
we can track it with FC?  This is mostly because the kernel bases differ
between RHEL4 and FC5.


Comment 30 Bimod 2006-01-18 07:49:54 UTC
I have the same problem with RHELU2 on similar H/W with more than 4GB memory;
the workaround "pci=nommconf" seems to resolve this.

Another workaround I observed is to disable "ACPI MCFG Table" in the BIOS
(i.e. if there is such an option in the BIOS).


Comment 31 Asgeir Nilsen 2006-02-15 13:46:04 UTC
Created attachment 124675
dmesg output

We're also experiencing this issue with a Fujitsu-Siemens Celsius V830
(Gigabyte dual Opteron motherboard).  A regular boot panics with Kernel panic:
PCI-DMA: high address but no IOMMU.

The attached dmesg is the output when using the pci=nommconf boot option.

Comment 32 Orla Hegarty 2006-02-15 17:36:29 UTC
*** Bug 177477 has been marked as a duplicate of this bug. ***

Comment 33 Orla Hegarty 2006-02-15 17:39:06 UTC
A similar problem occurred while installing from the RHEL 4 U2 CD set (x86_64) on an HP
ProLiant DL145 Generation 2 Server (1U rack mount) with 4GB of RAM and two AMD CPUs.


Found a workaround in the BIOS for this hardware ...

WORKAROUND
----------
1. Go into the BIOS -> Advanced -> MCFG Table 
2. Set the option to DISABLED
 



Comment 40 RHEL Program Management 2006-08-16 20:57:26 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this enhancement by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This enhancement is not yet committed for inclusion in an Update
release.

Comment 43 Pan Haifeng 2006-08-31 19:35:59 UTC
A similar problem occurred with RHEL4 U3 on an IBM x3755 AMD server with 4GB RAM. 
Workaround: add "pci=nommconf" to the boot parameters.

Comment 46 Jim Paradis 2006-09-15 22:32:29 UTC
If the original problem is fixed and the NIC is not working, I suggest filing
the latter as a separate issue.

Comment 47 Jim Paradis 2006-09-25 19:21:05 UTC
I'm closing this bug as a duplicate of Bug 191039.  

*** This bug has been marked as a duplicate of 191039 ***

Comment 48 Andrius Benokraitis 2006-09-25 20:53:20 UTC
This was fixed in RHEL 4 U4:

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html