Description of problem:
Second kernel hang with intel_iommu=on
========================================================
[root@hp-dl380g7-01 ~]# dmesg | grep IOMMU
Intel-IOMMU: enabled
IOMMU e7ffe000: ver 1:0 cap c90780106f0462 ecap f0207e
IOMMU 0xe7ffe000: using Queued invalidation
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device 0000:05:00.0 [0xdf63e000 - 0xdf640000]
IOMMU: Setting identity map for device 0000:02:00.0 [0xdf63e000 - 0xdf640000]
IOMMU: Setting identity map for device 0000:02:00.2 [0xdf63e000 - 0xdf640000]
IOMMU: Setting identity map for device 0000:00:1d.0 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:00:1d.1 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:00:1d.2 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:00:1d.3 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:02:00.0 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:02:00.2 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:02:00.4 [0xdf7f5000 - 0xdf7fb000]
IOMMU: Setting identity map for device 0000:00:1d.7 [0xdf7fc000 - 0xdf7fe000]
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0x1000000]
[root@hp-dl380g7-01 ~]# uname -a
Linux hp-dl380g7-01.lab.bos.redhat.com 2.6.18-273.el5 #1 SMP Mon Jul 4 14:12:24 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@hp-dl380g7-01 ~]# service kdump restart
Stopping kdump:[OK]
No kdump initial ramdisk found.[WARNING]
Rebuilding /boot/initrd-2.6.18-273.el5kdump.img
Starting kdump:[OK]
[root@hp-dl380g7-01 ~]# echo c > /proc/sysrq-trigger
----------------------------------------------------------------------------------------------
SysRq : Trigger a crashdump
Linux version 2.6.18-273.el5 (mockbuild.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-51)) #1 SMP Mon Jul 4 14:12:24 EDT 2011
Command line: ro root=/dev/VolGroup00/LogVol00 consoS1,115200n81 intel_iommu=on irqpoll maxcpus=1 reset_devices memmap=exactmap memmap=573K@64K memmap=6096K@32768K memmap=124387K@39437K elfcorehdr=163824K memmap=3K$637K memmap=52K#3659964K memmap=75532K$366memmap=2112K$4173824K memmap=8192K$4186112K
BIOS-provided physical RAM map:
BIOS-e820: 0000000000010000 - 000000000009f400 (usable)
BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
BIOS-e820000000100000 - 00000000df62f000 (usable)
BIOS-e820: 00000000df62f000 - 00000000df63c000 (ACPI data)
BIOS-e820: 00000000df63c000 - 00000000df63d000 (usable)
BIOS-e820: 00000000df63d000 - 00000000e4000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 000000031ffff000 (usable)
user-defined physical RAM map:
user: 0000000000010000 - 000000000009f400 (usable)
user: 000000000009f400 - 00000000000a0000 (reserved)
user: 0000000002000000 - 00000000025f4000 (usable)
user: 0000000002683400 - 0000000009ffc000 (usable)
user: 00000000df62f000 - 00000000df63c000 (ACPI data)
user: 00000000df63d000 - 00000000e4000000 (reserved)
user: 00000000fec00000 - 00000000fee10000 (reserved)
user: 00000000ff800000 - 000000000 (reserved)
DMI 2.6 present.
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 0 -> APIC 2 - 0
SRAT: PXM 0 -> APIC 3 -> Node 0
SRAT: PXM 0 -> APIC 18 -> Node 0
SRAT: PXM 0 -> APIC 19 -> Node 0
SRAT: PXM 0 -> APIC 20 -> Node 0
SRAT: PXM 0 -> APIC 21 -> Node 0
SRAT: PXM 1 -> APIC 32 -> Node 1
SRAT: PXM 1 -> APIC 33 -> Node 1
SRAT: PXM 1 -> APIC 34 -> Node 1
SRAT: PXM 1 -> APIC 35 -> Node 1
SRAT: PXM 1 -> APIC 50 -> Node 1
SRAT: PXM 1 -> APIC 51 -> Node 1
SRAT: PXM 1 -> APIC 52 -> Node 1
SRAT: PXM 1 -> APIC 53 -> Node 1
SRAT: Node 0 PXM 0 0-e0000000
SRAT: Node 0 PXM 0 0-1a0000000
SRAT: Node 1 PXM 1 1a0000000-320000000
Bootmem setup node 0 0000000000000000-0000000009ffc00mory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
ACPI: PM-Timer IO Port: 0x908
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x10] disabled)
ACPI: LAPIC (acpi_id[0x18] lapic_id[0x30] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] disabled)
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x24] disabled)
ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x14] enabled)
Processor #20 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x1c] lapic_id[0x34] enabled)
Processor #52 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Processor #2 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x22] enabled)
Processor #34 6:12 APIC 21
ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x12] enabled)
Processor #18 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x1a] lapic_id enabled)
Processor #50 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] disabled)
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x26] disabled)
ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x16] disabled)
ACPI: LAPIC (acpi_id[0x1e] lapic_id[0x36] disabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x09] lapic_id[0x11] disabled)
ACPI: LAPIC (acpi_id[0x19] lapic_id[0x31] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x25] disabled)
ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x15] enabled)
Processor #21 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x1d] lapic_id[0x35] enabled)
Processor #53 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Processor #3 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x23] enabled)
Processor #35 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x13] enabled)
Processor #19 6:12 APIC version 21
ACPI: LAPIC (acpi_id[0x1b] lapic_id[0x33] enabled)
Processor #51 6:1 version 21
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x27] disabled)
ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x17] disabled)
ACPI: LAPIC (acpi_id[0x1f] lapic_id[0x37] disabled)
ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 3ress 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x00] address[0xfec80000] gsi_base[sion 32, address 0xfec80000, GSI 24-47
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Setting APIC routing to physical flat
ACPI: d: 0x8086a201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 0000000002000000
Nosavess range: 00000000025f4000 - 0000000002684000
Allocating PCI resources starting at 10000000 (gap: 9ffc000:d5633000)
SMP: Allowing 32 CPUs, 16 hotplug CPUs
Built 1 zonelists. Total pages: 32195
Kernel d line: ro root=/dev/VolGroup00/LogVol00 console=ttyS1,115200n81 intel_iommu=on irqpoll maxcpus=1 reset_devices memmap=exactmap memmap=573K@64K memmap=6096K@32768K memmap=124387K@39437K elfcorehdr=163824K memmap=3K$637K memmap=52K#3659964K memmap=75532K$3660020K memmap=2112K$4173824K memmap=8192K$4186112K
Intel-IOMMU: enabled
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
Initializing CPU#0
PID hash table entries: 512 (order: 9, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode-cash table entries: 8192 (order: 4, 65536 bytes)
Checking aperture...
Memory: 115224k/163824k available (2603k kernel code, 15828k reserved, 1660k data, 224k init)
Calibrating delay loop (skipped), value calculated using timer frequency.. 4798.78 BogoMIPS (lpj=2399392)
Security Fra v1.0.0 initialized
SELinux: Initializing.
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 256K
CPU: L3 cache: 12288K
CPU 0/34 -> Node 0
using mwait in idle threads.
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 10
MCE: Machine Check Exception Reporting is disabled.
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
Using local APIC timer interrupts.
Detected 8.331 MHz APIC timer.
Brought up 1 CPUs
NMI watchdog testing PASSED.
time.c: Using 14.318180 MHz WALL HPET GTOD HPET/TSC timer.
time.c: Detected 2399.392 MHz processor.
checking if image is initramfs... it is
Freeing initrd memory: 4715k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
Warning: pci_mmcfg_init marking 256MB space uncacheable.
MCFG table requires 64MB uncacheable only.
Trooting with acpi_mcfg_max_pci_bus_num=on
PCI: Using MMCONFIG at e0000000
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000)
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Link [LNKA] (IRQs 5 7 *10 11), disabled.
ACPI: PCI Interrupt Link [LNKB] (IRQs 5 7 10 *11), disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 5 *7 10 11), disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs *5 7 10 11), disa.
ACPI: PCI Interrupt Link [LNKE] (IRQs *5 7 10 11), disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 5 7 10 11) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 5 7 *10 11), disabled.
ACPI: PCI Intert Link [LNKH] (IRQs 5 *7 10 11), disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 11 devices
usbcore: registered new driver usbfs
usbcore: registered ndriver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols =LABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
hpet0: at MMIO 0xfed00000 (virtual 0xffffffffff5fe000), IRQs 2, 8, 0, 0
hpet0: 4 64-bit timers, 14318180 Hz
DMAR:Host address width 39
DMDRHD base: 0x000000e7ffe000 flags: 0x1
IOMMU e7ffe000: ver 1:0 cap c90780106f0462 ecap f0207e
DMAR:RMRR base: 0x000000df7fc000 end: 0x000000df7fdfff
AR:RMRR base: 0x000000df63e000 end: 0x000000df63ffffff
DMAR:ATSR flags: 0x0
IOMMU <====================================================Hang

Version-Release number of selected component (if applicable):
kernel-2.6.18-273.el5

How reproducible:
100% on hp-dl380g7-01.lab.bos.redhat.com

Steps to Reproduce:
1. Enable IOMMU
2. Start kdump service
3. Trigger crash

Actual results:
Second kernel hang

Expected results:
vmcore captured

Additional info:
Please try: "intel_iommu=on iommu=pt". It works for me on ibm-x3550m3-01.rhts.eng.nay.redhat.com, but this machine does not hang at the same point: it hangs after some DMAR fault.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Important: Disable IOMMU on Intel Chipsets
A limitation in the current implementation of the Intel IOMMU driver can occasionally prevent the kdump service from capturing the core dump image. To use kdump on Intel architectures reliably, it is advised that the IOMMU support is disabled.
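
Before relying on kdump on an affected system, it may help to confirm whether the running kernel was booted with the IOMMU enabled. The following is only a minimal sketch for a POSIX shell: `has_kernel_arg` is a hypothetical helper name introduced here for illustration and is not part of kexec-tools or the kdump service. It matches whole parameters only, so `iommu=pt` is not mistaken for part of `intel_iommu=on`.

```shell
# has_kernel_arg ARG [CMDLINE]
# Hypothetical helper: succeeds if ARG appears as a whole parameter in CMDLINE.
# CMDLINE defaults to the running kernel's command line from /proc/cmdline.
has_kernel_arg() {
    arg=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    # Pad with spaces so the first and last parameters match like any other.
    case " $cmdline " in
        *" $arg "*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_kernel_arg intel_iommu=on; then
    echo "intel_iommu=on is set; kdump may fail to capture a vmcore on this system"
fi
```

The optional second argument lets the same check run against a saved command line, such as the one printed in the second kernel's boot log above.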