Description of problem:
After a crash was triggered on dell-pesc1420-01.rhts.eng.bos.redhat.com, kdump failed to reboot the system when saving the vmcore failed:
===================================================================
/mnt/tests/kernel/kdump/crash-sysrq-c
/mnt/tests/kernel/kdump/crash-sysrq-c
SysRq : Trigger a crash
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c0691faf>] sysrq_handle_crash+0xf/0x20
*pdpt = 00000000347f0001 *pde = 00000000346dd067 *pte = 0000000000000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index1/shared_cpu_map
Modules linked in: sunrpc p4_clockmod ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport e1000 microcode dcdbas serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support e752x_edac edac_core ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom ata_generic pata_acpi ata_piix dm_mod [last unloaded: scsi_wait_scan]
Pid: 14844, comm: runtest.sh Not tainted (2.6.32-125.el6.i686 #1) PowerEdge SC1420
EIP: 0060:[<c0691faf>] EFLAGS: 00010096 CPU: 1
EIP is at sysrq_handle_crash+0xf/0x20
EAX: 00000063 EBX: 00000063 ECX: c0a024f4 EDX: 00000000
ESI: c0a2a4a0 EDI: 00000286 EBP: 00000000 ESP: f4d57f24
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process runtest.sh (pid: 14844, ti=f4d56000 task=f4df1ab0 task.ti=f4d56000)
Stack:
c06921cb c097da52 c09858c9 00000007 00000000 00000002 f4da6300 c0692230
<0> fffffffb c069226f b770d000 f6173880 c0573434 f4d57f9c 00000002 b770d000
<0> f4da6300 00000002 b770d000 c05733d0 c0526e70 f4d57f9c f4df1ab0 f4d57fb4
Call Trace:
[<c06921cb>] ? __handle_sysrq+0xfb/0x160
[<c0692230>] ? write_sysrq_trigger+0x0/0x50
[<c069226f>] ? write_sysrq_trigger+0x3f/0x50
[<c0573434>] ? proc_reg_write+0x64/0xa0
[<c05733d0>] ? proc_reg_write+0x0/0xa0
[<c0526e70>] ? vfs_write+0xa0/0x190
[<c05278f1>] ? sys_write+0x41/0x70
[<c0823444>] ?
syscall_call+0x7/0xb Code: a2 c0 01 0f b6 41 03 19 d2 f7 d2 83 e2 03 83 e0 cf c1 e2 04 09 d0 88 41 03 f3 c3 90 c7 05 28 24 a0 c0 01 00 00 00 0f ae f8 89 f6 <c6> 05 00 00 00 00 01 c3 89 f6 8d bc 27 00 00 00 00 8d 50 d0 83 EIP: [<c0691faf>] sysrq_handle_crash+0xf/0x20 SS:ESP 0068:f4d57f24 CR2: 0000000000000000 Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.32-125.el6.i686 (mockbuild.bos.redhat.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Mon Mar 21 10:04:54 EDT 2011 KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC BIOS-provided physical RAM map: BIOS-e820: 0000000000000100 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003fe8cc00 (usable) BIOS-e820: 000000003fe8cc00 - 000000003fe8ec00 (ACPI NVS) BIOS-e820: 000000003fe8ec00 - 000000003fe90c00 (ACPI data) BIOS-e820: 000000003fe90c00 - 0000000040000000 (reserved) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved) BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved) BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved) last_pfn = 0x3fe8c max_arch_pfn = 0x400000 user-defined physical RAM map: user: 0000000000000000 - 0000000000001000 (reserved) user: 0000000000001000 - 00000000000a0000 (usable) user: 00000000000f0000 - 0000000000100000 (reserved) user: 0000000002000000 - 0000000009f5b000 (usable) user: 0000000009f5b400 - 0000000009f60000 (usable) user: 0000000009fff000 - 000000000a000000 (usable) user: 000000003fe8cc00 - 000000003fe90c00 (ACPI data) user: 000000003fe90c00 - 0000000040000000 (reserved) user: 00000000e0000000 - 00000000f0000000 (reserved) user: 00000000fec00000 - 00000000fed00400 (reserved) user: 00000000fed20000 - 
00000000feda0000 (reserved) user: 00000000fee00000 - 00000000fef00000 (reserved) user: 00000000ffb00000 - 0000000100000000 (reserved) DMI 2.3 present. last_pfn = 0xa000 max_arch_pfn = 0x400000 x86 PAT enabled: cpu 0, old 0x7010600070106, new 0x7010600070106 total RAM covered: 1023M Found optimal setting for mtrr clean up gran_size: 64K chunk_size: 2M num_reg: 2 lose cover RAM: 0G init_memory_mapping: 0000000000000000-000000000a000000 Using x86 segment limits to approximate NX protection RAMDISK: 09b44000 - 09f4fdc9 ACPI: RSDP 000fec10 00014 (v00 DELL ) ACPI: RSDT 000fcb6d 00040 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: FACP 000fcbad 00074 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: DSDT fffc1400 02FA2 (v01 DELL dt_ex 00001000 MSFT 0100000D) ACPI: FACS 3fe8cc00 00040 ACPI: SSDT fffc4599 00096 (v01 DELL st_ex 00001000 MSFT 0100000D) ACPI: APIC 000fcc21 0008A (v01 DELL PE1420 00000007 ASL 00000061) ACPI: BOOT 000fccab 00028 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: ASF! 000fccd3 00067 (v16 DELL PE1420 00000007 ASL 00000061) ACPI: MCFG 000fcd3a 0003E (v01 DELL PE1420 00000007 ASL 00000061) ACPI: HPET 000fcd78 00038 (v01 DELL PE1420 00000007 ASL 00000061) 0MB HIGHMEM available. 160MB LOWMEM available. 
mapped low ram: 0 - 0a000000 low ram: 0 - 0a000000 node 0 low ram: 00000000 - 0a000000 node 0 bootmap 00002000 - 00003400 (9 early reservations) ==> bootmem [0000000000 - 000a000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0000001000 - 0000002000] EX TRAMPOLINE ==> [0000001000 - 0000002000] #2 [0000006000 - 0000007000] TRAMPOLINE ==> [0000006000 - 0000007000] #3 [0002400000 - 0002becd90] TEXT DATA BSS ==> [0002400000 - 0002becd90] #4 [0009b44000 - 0009f4fdc9] RAMDISK ==> [0009b44000 - 0009f4fdc9] #5 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000] #6 [0002bed000 - 0002c05160] BRK ==> [0002bed000 - 0002c05160] #7 [0000007000 - 000000a000] PGTABLE ==> [0000007000 - 000000a000] #8 [0000002000 - 0000004000] BOOTMAP ==> [0000002000 - 0000004000] found SMP MP-table at [c00fe710] fe710 Zone PFN ranges: DMA 0x00000001 -> 0x00001000 Normal 0x00001000 -> 0x0000a000 HighMem 0x0000a000 -> 0x0000a000 Movable zone start PFN for each node early_node_map[4] active PFN ranges 0: 0x00000001 -> 0x000000a0 0: 0x00002000 -> 0x00009f5b 0: 0x00009f5c -> 0x00009f60 0: 0x00009fff -> 0x0000a000 Using APIC driver default ACPI: PM-Timer IO Port: 0x808 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] enabled) ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1]) ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24]) IOAPIC[1]: apic_id 9, version 32, address 0xfec80000, GSI 24-47 ACPI: IOAPIC (id[0x0a] address[0xfec80800] gsi_base[48]) IOAPIC[2]: apic_id 10, version 32, address 0xfec80800, GSI 48-71 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) Using ACPI (MADT) for SMP configuration information ACPI: HPET 
id: 0x8086a201 base: 0xfed00000 SMP: Allowing 4 CPUs, 0 hotplug CPUs PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000 PM: Registered nosave memory: 00000000000f0000 - 0000000000100000 PM: Registered nosave memory: 0000000000100000 - 0000000002000000 PM: Registered nosave memory: 0000000009f5b000 - 0000000009f5c000 PM: Registered nosave memory: 0000000009f60000 - 0000000009fff000 Allocating PCI resources starting at 40000000 (gap: 40000000:a0000000) Booting paravirtualized kernel on bare hardware NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 14 pages/cpu @c2200000 s34584 r0 d22760 u524288 pcpu-alloc: s34584 r0 d22760 u524288 alloc=1*2097152 pcpu-alloc: [0] 0 1 2 3 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32447 Kernel command line: ro root=/dev/mapper/vg_dellpesc142001-lv_root rd_LVM_LV=vg_dellpesc142001/lv_root rd_LVM_LV=vg_dellpesc142001/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=ttyS0,115200 irqpoll maxcpus=1 reset_devices cgroup_disable=memory memmap=exactmap memmap=4K$0K memmap=636K@4K memmap=64K$960K memmap=130412K@32768K memmap=19K@163181K memmap=4K@163836K memmap=8K#1047091K memmap=8K#1047099K memmap=1469K$1047107K memmap=262144K$3670016K memmap=1025K$4173824K memmap=512K$4174976K memmap=1024K$4175872K memmap=5120K$4189184K elfcorehdr=163180K Misrouted IRQ fixup and polling support enabled This may significantly impact system performance Disabling memory control group subsystem PID hash table entries: 512 (order: -1, 2048 bytes) Dentry cache hash table entries: 16384 (order: 4, 65536 bytes) Inode-cache hash table entries: 8192 (order: 3, 32768 bytes) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 Initializing HighMem for node 0 (00000000:00000000) Memory: 113984k/163840k available (4258k kernel code, 16940k reserved, 2260k data, 524k init, 0k highmem) virtual kernel memory layout: fixmap : 0xffad5000 - 0xfffff000 (5288 kB) pkmap : 0xff600000 - 0xff800000 (2048 kB) vmalloc : 0xca800000 - 0xff5fe000 ( 845 MB) lowmem : 0xc0000000 - 0xca000000 ( 160 MB) .init : 0xc2a5e000 - 0xc2ae1000 ( 524 kB) .data : 0xc2828bea - 0xc2a5df28 (2260 kB) .text : 0xc2400000 - 0xc2828bea (4258 kB) Checking if this processor honours the WP bit even in supervisor mode...Ok. Hierarchical RCU implementation. NR_IRQS:2304 nr_irqs:1024 Spurious LAPIC timer interrupt on cpu 0 Console: colour VGA+ 80x25 console [ttyS0] enabled HPET: 3 timers in total, 0 timers will be used for per-cpu timer Fast TSC calibration using PIT Detected 3192.089 MHz processor. Calibrating delay loop (skipped), value calculated using timer frequency.. 6384.17 BogoMIPS (lpj=3192089) pid_max: default: 32768 minimum: 301 Security Framework initialized SELinux: Initializing. Mount-cache hash table entries: 512 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys blkio CPU: Physical Processor ID: 3 CPU: Processor Core ID: 0 mce: CPU supports 4 MCE banks using mwait in idle threads. Checking 'hlt' instruction... OK. 
SMP alternatives: switching to UP code
ACPI: Core revision 20090903
[-- MARK -- Tue Mar 29 01:35:00 2011]
[-- MARK -- Tue Mar 29 01:40:00 2011]
[-- MARK -- Tue Mar 29 01:45:00 2011]
[-- MARK -- Tue Mar 29 01:50:00 2011]
[-- MARK -- Tue Mar 29 01:55:00 2011]
[-- MARK -- Tue Mar 29 02:00:00 2011]
[-- MARK -- Tue Mar 29 02:05:00 2011]
[-- MARK -- Tue Mar 29 02:10:00 2011]
[-- MARK -- Tue Mar 29 02:15:00 2011]
[-- MARK -- Tue Mar 29 02:20:00 2011]
[-- MARK -- Tue Mar 29 02:25:00 2011]
[-- MARK -- Tue Mar 29 02:30:00 2011]
[-- MARK -- Tue Mar 29 02:35:00 2011]
[-- MARK -- Tue Mar 29 02:40:00 2011]
[-- MARK -- Tue Mar 29 02:45:00 2011]
[-- MARK -- Tue Mar 29 02:50:00 2011]
[-- MARK -- Tue Mar 29 02:55:00 2011]
[-- MARK -- Tue Mar 29 03:00:00 2011]

Version-Release number of selected component (if applicable):
kernel-2.6.32-125
kexec-tools-2.0.0-174

How reproducible:
dell-pesc1420-01.rhts.eng.bos.redhat.com

Steps to Reproduce:
1. Install RHEL6.1-20110323.1
2. Use an invalid kdump.conf so that saving the vmcore fails

Actual results:
The system failed to reboot.

Expected results:
The system reboots.

Additional info:
https://beaker.engineering.redhat.com/recipes/138318
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Tested with latest kernel/kexec-tools:
=============================================================================
[root@dell-pesc1420-01 ~]# service kdump status
Kdump is operational
[root@dell-pesc1420-01 ~]# rpm -q kernel kexec-tools
kernel-2.6.32-132.el6.i686
kexec-tools-2.0.0-186.el6.i686
[root@dell-pesc1420-01 ~]# tail /etc/kdump.conf
#kdump_post /var/crash/scripts/kdump-post.sh
#extra_bins /usr/bin/lftp
#disk_timeout 30
#extra_modules gfs2
#options modulename options
#default shell
ext4 /dev/mapper/vg_dellpesc142001-lv_root
core_collector makedumpfile --nosuchoption
default reboot
------------------------------------------------------------------------------
[cye@wlan-5-117 ~]$ console dell-pesc1420-01.rhts.eng.bos.redhat.com
[Enter `^Ec?' for help]
[-- MOTD -- http://intranet.corp.redhat.com/ic/intranet/ConServer]
SysRq : Trigger a crash
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c069311f>] sysrq_handle_crash+0xf/0x20
*pdpt = 0000000001457001 *pde = 000000003da4e067
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.2/0000:03:0e.0/irq
Modules linked in: sunrpc p4_clockmod ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport e1000 microcode dcdbas sg serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e752x_edac edac_core ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi ata_piix dm_mod [last unloaded: scsi_wait_scan]
Pid: 4835, comm: bash Not tainted (2.6.32-132.el6.i686 #1) PowerEdge SC1420
EIP: 0060:[<c069311f>] EFLAGS: 00010096 CPU: 1
EIP is at sysrq_handle_crash+0xf/0x20
EAX: 00000063 EBX: 00000063 ECX: c0a04414 EDX: 00000000
ESI: c0a2c500 EDI: 00000286 EBP: 00000000 ESP: f4577f24
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process bash (pid: 4835, ti=f4576000 task=c15ecab0 task.ti=f4576000)
Stack:
c069333b c097ed1a c0986b91 00000007 00000000 00000002 f440ee00 c06933a0
<0> fffffffb c06933df b7796000 f6192b00 c05745c4 f4577f9c 00000002 b7796000
<0> f440ee00
00000002 b7796000 c0574560 c0528010 f4577f9c c15ecab0 f4577fb4 Call Trace: [<c069333b>] ? __handle_sysrq+0xfb/0x160 [<c06933a0>] ? write_sysrq_trigger+0x0/0x50 [<c06933df>] ? write_sysrq_trigger+0x3f/0x50 [<c05745c4>] ? proc_reg_write+0x64/0xa0 [<c0574560>] ? proc_reg_write+0x0/0xa0 [<c0528010>] ? vfs_write+0xa0/0x190 [<c0528a91>] ? sys_write+0x41/0x70 [<c0824894>] ? syscall_call+0x7/0xb Code: a2 c0 01 0f b6 41 03 19 d2 f7 d2 83 e2 03 83 e0 cf c1 e2 04 09 d0 88 41 03 f3 c3 90 c7 05 48 43 a0 c0 01 00 00 00 0f ae f8 89 f6 <c6> 05 00 00 00 00 01 c3 89 f6 8d bc 27 00 00 00 00 8d 50 d0 83 EIP: [<c069311f>] sysrq_handle_crash+0xf/0x20 SS:ESP 0068:f4577f24 CR2: 0000000000000000 Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.32-132.el6.i686 (mockbuild.bos.redhat.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Tue Apr 12 19:46:26 EDT 2011 KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC BIOS-provided physical RAM map: BIOS-e820: 0000000000000100 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003fe8cc00 (usable) BIOS-e820: 000000003fe8cc00 - 000000003fe8ec00 (ACPI NVS) BIOS-e820: 000000003fe8ec00 - 000000003fe90c00 (ACPI data) BIOS-e820: 000000003fe90c00 - 0000000040000000 (reserved) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved) BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved) BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved) last_pfn = 0x3fe8c max_arch_pfn = 0x400000 user-defined physical RAM map: user: 0000000000000000 - 0000000000001000 (reserved) user: 0000000000001000 - 00000000000a0000 (usable) user: 00000000000f0000 - 0000000000100000 (reserved) user: 0000000002000000 - 
0000000011f5b000 (usable) user: 0000000011f5b400 - 0000000011f60000 (usable) user: 0000000011fff000 - 0000000012000000 (usable) user: 000000003fe8cc00 - 000000003fe90c00 (ACPI data) user: 000000003fe90c00 - 0000000040000000 (reserved) user: 00000000e0000000 - 00000000f0000000 (reserved) user: 00000000fec00000 - 00000000fed00400 (reserved) user: 00000000fed20000 - 00000000feda0000 (reserved) user: 00000000fee00000 - 00000000fef00000 (reserved) user: 00000000ffb00000 - 0000000100000000 (reserved) DMI 2.3 present. SMBIOS version 2.3 @ 0xF0430 last_pfn = 0x12000 max_arch_pfn = 0x400000 x86 PAT enabled: cpu 0, old 0x7010600070106, new 0x7010600070106 total RAM covered: 1023M Found optimal setting for mtrr clean up gran_size: 64K chunk_size: 2M num_reg: 2 lose cover RAM: 0G init_memory_mapping: 0000000000000000-0000000012000000 Using x86 segment limits to approximate NX protection RAMDISK: 11b3e000 - 11f4ffa8 ACPI: RSDP 000fec10 00014 (v00 DELL ) ACPI: RSDT 000fcb6d 00040 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: FACP 000fcbad 00074 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: DSDT fffc1400 02FA2 (v01 DELL dt_ex 00001000 MSFT 0100000D) ACPI: FACS 3fe8cc00 00040 ACPI: SSDT fffc4599 00096 (v01 DELL st_ex 00001000 MSFT 0100000D) ACPI: APIC 000fcc21 0008A (v01 DELL PE1420 00000007 ASL 00000061) ACPI: BOOT 000fccab 00028 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: ASF! 000fccd3 00067 (v16 DELL PE1420 00000007 ASL 00000061) ACPI: MCFG 000fcd3a 0003E (v01 DELL PE1420 00000007 ASL 00000061) ACPI: HPET 000fcd78 00038 (v01 DELL PE1420 00000007 ASL 00000061) 0MB HIGHMEM available. 288MB LOWMEM available. 
mapped low ram: 0 - 12000000 low ram: 0 - 12000000 node 0 low ram: 00000000 - 12000000 node 0 bootmap 00002000 - 00004400 (9 early reservations) ==> bootmem [0000000000 - 0012000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0000001000 - 0000002000] EX TRAMPOLINE ==> [0000001000 - 0000002000] #2 [0000006000 - 0000007000] TRAMPOLINE ==> [0000006000 - 0000007000] #3 [0002400000 - 0002befe50] TEXT DATA BSS ==> [0002400000 - 0002befe50] #4 [0011b3e000 - 0011f4ffa8] RAMDISK ==> [0011b3e000 - 0011f4ffa8] #5 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000] #6 [0002bf0000 - 0002c08168] BRK ==> [0002bf0000 - 0002c08168] #7 [0000007000 - 000000a000] PGTABLE ==> [0000007000 - 000000a000] #8 [0000002000 - 0000005000] BOOTMAP ==> [0000002000 - 0000005000] found SMP MP-table at [c00fe710] fe710 Zone PFN ranges: DMA 0x00000001 -> 0x00001000 Normal 0x00001000 -> 0x00012000 HighMem 0x00012000 -> 0x00012000 Movable zone start PFN for each node early_node_map[4] active PFN ranges 0: 0x00000001 -> 0x000000a0 0: 0x00002000 -> 0x00011f5b 0: 0x00011f5c -> 0x00011f60 0: 0x00011fff -> 0x00012000 Using APIC driver default ACPI: PM-Timer IO Port: 0x808 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] enabled) ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1]) ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24]) IOAPIC[1]: apic_id 9, version 32, address 0xfec80000, GSI 24-47 ACPI: IOAPIC (id[0x0a] address[0xfec80800] gsi_base[48]) IOAPIC[2]: apic_id 10, version 32, address 0xfec80800, GSI 48-71 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) Using ACPI (MADT) for SMP configuration information ACPI: HPET 
id: 0x8086a201 base: 0xfed00000 SMP: Allowing 4 CPUs, 0 hotplug CPUs PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000 PM: Registered nosave memory: 00000000000f0000 - 0000000000100000 PM: Registered nosave memory: 0000000000100000 - 0000000002000000 PM: Registered nosave memory: 0000000011f5b000 - 0000000011f5c000 PM: Registered nosave memory: 0000000011f60000 - 0000000011fff000 Allocating PCI resources starting at 40000000 (gap: 40000000:a0000000) Booting paravirtualized kernel on bare hardware NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 14 pages/cpu @c2200000 s34456 r0 d22888 u524288 pcpu-alloc: s34456 r0 d22888 u524288 alloc=1*2097152 pcpu-alloc: [0] 0 1 2 3 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 64959 Kernel command line: ro root=/dev/mapper/vg_dellpesc142001-lv_root rd_LVM_LV=vg_dellpesc142001/lv_root rd_LVM_LV=vg_dellpesc142001/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=ttyS0,115200 irqpoll maxcpus=1 reset_devices cgroup_disable=memory memmap=exactmap memmap=4K$0K memmap=636K@4K memmap=64K$960K memmap=261484K@32768K memmap=19K@294253K memmap=4K@294908K memmap=8K#1047091K memmap=8K#1047099K memmap=1469K$1047107K memmap=262144K$3670016K memmap=1025K$4173824K memmap=512K$4174976K memmap=1024K$4175872K memmap=5120K$4189184K elfcorehdr=294252K Misrouted IRQ fixup and polling support enabled This may significantly impact system performance Disabling memory control group subsystem PID hash table entries: 1024 (order: 0, 4096 bytes) Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 Initializing HighMem for node 0 (00000000:00000000) Memory: 244736k/294912k available (4263k kernel code, 17072k reserved, 2264k data, 524k init, 0k highmem) virtual kernel memory layout: fixmap : 0xffad5000 - 0xfffff000 (5288 kB) pkmap : 0xff600000 - 0xff800000 (2048 kB) vmalloc : 0xd2800000 - 0xff5fe000 ( 717 MB) lowmem : 0xc0000000 - 0xd2000000 ( 288 MB) .init : 0xc2a60000 - 0xc2ae3000 ( 524 kB) .data : 0xc2829f95 - 0xc2a5ffa8 (2264 kB) .text : 0xc2400000 - 0xc2829f95 (4263 kB) Checking if this processor honours the WP bit even in supervisor mode...Ok. Hierarchical RCU implementation. NR_IRQS:2304 nr_irqs:1024 Spurious LAPIC timer interrupt on cpu 0 Console: colour VGA+ 80x25 console [ttyS0] enabled HPET: 3 timers in total, 0 timers will be used for per-cpu timer Fast TSC calibration using PIT Detected 3191.967 MHz processor. Calibrating delay loop (skipped), value calculated using timer frequency.. 6383.93 BogoMIPS (lpj=3191967) pid_max: default: 32768 minimum: 301 Security Framework initialized SELinux: Initializing. Mount-cache hash table entries: 512 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys blkio CPU: Physical Processor ID: 3 CPU: Processor Core ID: 0 mce: CPU supports 4 MCE banks using mwait in idle threads. Checking 'hlt' instruction... OK. SMP alternatives: switching to UP code ACPI: Core revision 20090903 <--------------------Hang
Please add "initcall_debug ignore_loglevel" to the command line of the second (kdump) kernel to see if we can get more info.
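For anyone repeating this request, here is a minimal sketch of one way to get the two parameters onto the capture kernel's command line. It assumes the kdump service takes its extra arguments from a `KDUMP_COMMANDLINE_APPEND="..."` line in `/etc/sysconfig/kdump`. Neither the file nor the variable is named in this report, and `append_kdump_args` is a hypothetical helper written for illustration. It only edits text passed to it; writing the file back and restarting the kdump service so the capture kernel is reloaded are left to the operator.

```python
import re

# Assumption: the capture kernel's extra arguments live on a line of the form
# KDUMP_COMMANDLINE_APPEND="..." (not confirmed anywhere in this report).
_APPEND_LINE = re.compile(
    r'^(KDUMP_COMMANDLINE_APPEND=")([^"]*)(")[ \t]*$', re.MULTILINE
)


def append_kdump_args(sysconfig_text, extra_args):
    """Return sysconfig_text with extra_args added to KDUMP_COMMANDLINE_APPEND.

    Arguments that are already present are not added a second time.
    """

    def add_missing(match):
        current = match.group(2).split()
        for arg in extra_args:
            if arg not in current:
                current.append(arg)
        return match.group(1) + " ".join(current) + match.group(3)

    new_text, count = _APPEND_LINE.subn(add_missing, sysconfig_text)
    if count == 0:
        raise ValueError("KDUMP_COMMANDLINE_APPEND line not found")
    return new_text


if __name__ == "__main__":
    # Sample content only; a real file would be read from disk by the caller.
    sample = 'KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory"\n'
    print(append_kdump_args(sample, ["initcall_debug", "ignore_loglevel"]), end="")
```

Keeping the helper idempotent means it can be run again before each retest without duplicating the parameters.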
(In reply to comment #4)
> Please add "initcall_debug ignore_loglevel " to the second kernel to see if we
> can get more info.

Tested four times with RHEL6.2-20110822.5: one run rebooted successfully and three runs hung:
===========================================================
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c069a5cf>] sysrq_handle_crash+0xf/0x20
*pdpt = 0000000034e09001 *pde = 000000003dcc6067
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.2/0000:03:0e.0/irq
Modules linked in: sunrpc p4_clockmod ipv6 ppdev parport_pc parport e1000 microcode dcdbas serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support e752x_edac edac_core ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
Pid: 2011, comm: bash Not tainted 2.6.32-191.el6.i686 #1 Dell Inc. PowerEdge SC1420 /0T7495
EIP: 0060:[<c069a5cf>] EFLAGS: 00010096 CPU: 3
EIP is at sysrq_handle_crash+0xf/0x20
EAX: 00000063 EBX: 00000063 ECX: c0a34514 EDX: 00000000
ESI: c0a5ce20 EDI: 00000286 EBP: 00000000 ESP: f67fdf24
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process bash (pid: 2011, ti=f67fc000 task=f66c0030 task.ti=f67fc000)
Stack:
c069a7eb c098df61 c099645a 00000007 00000000 00000002 f6798b00 c069a850
<0> fffffffb c069a88f b73f6000 f65ab880 c0578054 f67fdf9c 00000002 b73f6000
<0> f6798b00 00000002 b73f6000 c0577ff0 c052a2f0 f67fdf9c f66c0030 f67fdfb4
Call Trace:
[<c069a7eb>] ? __handle_sysrq+0xfb/0x160
[<c069a850>] ? write_sysrq_trigger+0x0/0x50
[<c069a88f>] ? write_sysrq_trigger+0x3f/0x50
[<c0578054>] ? proc_reg_write+0x64/0xa0
[<c0577ff0>] ? proc_reg_write+0x0/0xa0
[<c052a2f0>] ? vfs_write+0xa0/0x190
[<c052ad41>] ? sys_write+0x41/0x70
[<c0830014>] ?
syscall_call+0x7/0xb Code: a5 c0 01 0f b6 41 03 19 d2 f7 d2 83 e2 03 83 e0 cf c1 e2 04 09 d0 88 41 03 f3 c3 90 c7 05 48 44 a3 c0 01 00 00 00 0f ae f8 89 f6 <c6> 05 00 00 00 00 01 c3 89 f6 8d bc 27 00 00 00 00 8d 50 d0 83 EIP: [<c069a5cf>] sysrq_handle_crash+0xf/0x20 SS:ESP 0068:f67fdf24 CR2: 0000000000000000 Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.32-191.el6.i686 (mockbuild.bos.redhat.com) (gcc version 4.4.5 20110214 (Red Hat 4.4.5-6) (GCC) ) #1 SMP Wed Aug 17 20:21:22 EDT 2011 KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC BIOS-provided physical RAM map: BIOS-e820: 0000000000000100 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003fe8cc00 (usable) BIOS-e820: 000000003fe8cc00 - 000000003fe8ec00 (ACPI NVS) BIOS-e820: 000000003fe8ec00 - 000000003fe90c00 (ACPI data) BIOS-e820: 000000003fe90c00 - 0000000040000000 (reserved) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved) BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved) BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved) debug: ignoring loglevel setting. 
last_pfn = 0x3fe8c max_arch_pfn = 0x400000 user-defined physical RAM map: user: 0000000000000000 - 0000000000001000 (reserved) user: 0000000000001000 - 00000000000a0000 (usable) user: 00000000000f0000 - 0000000000100000 (reserved) user: 0000000002000000 - 0000000009f5b000 (usable) user: 0000000009f5b400 - 0000000009f60000 (usable) user: 0000000009fff000 - 000000000a000000 (usable) user: 000000003fe8cc00 - 000000003fe90c00 (ACPI data) user: 000000003fe90c00 - 0000000040000000 (reserved) user: 00000000e0000000 - 00000000f0000000 (reserved) user: 00000000fec00000 - 00000000fed00400 (reserved) user: 00000000fed20000 - 00000000feda0000 (reserved) user: 00000000fee00000 - 00000000fef00000 (reserved) user: 00000000ffb00000 - 0000000100000000 (reserved) DMI 2.3 present. SMBIOS version 2.3 @ 0xF0430 DMI: Dell Inc. PowerEdge SC1420 /0T7495, BIOS A00 09/09/2004 e820 update range: 0000000000000000 - 0000000000001000 (usable) ==> (reserved) e820 remove range: 00000000000a0000 - 0000000000100000 (usable) last_pfn = 0xa000 max_arch_pfn = 0x400000 MTRR default type: uncachable MTRR fixed ranges enabled: 00000-9FFFF write-back A0000-BFFFF uncachable C0000-FFFFF write-protect MTRR variable ranges enabled: 0 base 000000000 mask FC0000000 write-back 1 base 03FF00000 mask FFFF00000 uncachable 2 disabled 3 disabled 4 disabled 5 disabled 6 disabled 7 disabled x86 PAT enabled: cpu 0, old 0x7010600070106, new 0x7010600070106 original variable MTRRs reg 0, base: 0GB, range: 1GB, type WB reg 1, base: 1023MB, range: 1MB, type UC total RAM covered: 1023M Found optimal setting for mtrr clean up gran_size: 64K chunk_size: 2M num_reg: 2 lose cover RAM: 0G New variable MTRRs reg 0, base: 0GB, range: 1GB, type WB reg 1, base: 1023MB, range: 1MB, type UC initial memory mapped : 0 - 03000000 init_memory_mapping: 0000000000000000-000000000a000000 Using x86 segment limits to approximate NX protection 0000000000 - 0000200000 page 4k 0000200000 - 000a000000 page 2M kernel direct mapping tables up to 
a000000 @ 7000-e000 RAMDISK: 09b69000 - 09f4faf6 ACPI: RSDP 000fec10 00014 (v00 DELL ) ACPI: RSDT 000fcb6d 00040 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: FACP 000fcbad 00074 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: DSDT fffc1400 02FA2 (v01 DELL dt_ex 00001000 MSFT 0100000D) ACPI: FACS 3fe8cc00 00040 ACPI: SSDT fffc4599 00096 (v01 DELL st_ex 00001000 MSFT 0100000D) ACPI: APIC 000fcc21 0008A (v01 DELL PE1420 00000007 ASL 00000061) ACPI: BOOT 000fccab 00028 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: ASF! 000fccd3 00067 (v16 DELL PE1420 00000007 ASL 00000061) ACPI: MCFG 000fcd3a 0003E (v01 DELL PE1420 00000007 ASL 00000061) ACPI: HPET 000fcd78 00038 (v01 DELL PE1420 00000007 ASL 00000061) ACPI: Local APIC address 0xfee00000 0MB HIGHMEM available. 160MB LOWMEM available. mapped low ram: 0 - 0a000000 low ram: 0 - 0a000000 node 0 low ram: 00000000 - 0a000000 node 0 bootmap 00002000 - 00003400 (9 early reservations) ==> bootmem [0000000000 - 000a000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0000001000 - 0000002000] EX TRAMPOLINE ==> [0000001000 - 0000002000] #2 [0000006000 - 0000007000] TRAMPOLINE ==> [0000006000 - 0000007000] #3 [0002400000 - 0002c3af70] TEXT DATA BSS ==> [0002400000 - 0002c3af70] #4 [0009b69000 - 0009f4faf6] RAMDISK ==> [0009b69000 - 0009f4faf6] #5 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000] #6 [0002c3b000 - 0002c53168] BRK ==> [0002c3b000 - 0002c53168] #7 [0000007000 - 000000a000] PGTABLE ==> [0000007000 - 000000a000] #8 [0000002000 - 0000004000] BOOTMAP ==> [0000002000 - 0000004000] found SMP MP-table at [c00fe710] fe710 Zone PFN ranges: DMA 0x00000001 -> 0x00001000 Normal 0x00001000 -> 0x0000a000 HighMem 0x0000a000 -> 0x0000a000 Movable zone start PFN for each node early_node_map[4] active PFN ranges 0: 0x00000001 -> 0x000000a0 0: 0x00002000 -> 0x00009f5b 0: 0x00009f5c -> 0x00009f60 0: 0x00009fff -> 0x0000a000 On node 0 totalpages: 32767 DMA zone: 32 pages used for memmap 
DMA zone: 0 pages reserved DMA zone: 127 pages, LIFO batch:0 Normal zone: 288 pages used for memmap Normal zone: 32320 pages, LIFO batch:7 Using APIC driver default ACPI: PM-Timer IO Port: 0x808 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) ACPI: NR_CPUS/possible_cpus limit of 1 almost reached. Keeping one slot for boot cpu. Processor 0/0x0 ignored. ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled) ACPI: NR_CPUS/possible_cpus limit of 1 almost reached. Keeping one slot for boot cpu. Processor 1/0x6 ignored. ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled) ACPI: NR_CPUS/possible_cpus limit of 1 almost reached. Keeping one slot for boot cpu. Processor 2/0x1 ignored. ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] enabled) ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1]) ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24]) IOAPIC[1]: apic_id 9, version 32, address 0xfec80000, GSI 24-47 ACPI: IOAPIC (id[0x0a] address[0xfec80800] gsi_base[48]) IOAPIC[2]: apic_id 10, version 32, address 0xfec80800, GSI 48-71 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. 
Using ACPI (MADT) for SMP configuration information ACPI: HPET id: 0x8086a201 base: 0xfed00000 4 Processors exceeds NR_CPUS limit of 1 SMP: Allowing 1 CPUs, 0 hotplug CPUs nr_irqs_gsi: 72 PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000 PM: Registered nosave memory: 00000000000f0000 - 0000000000100000 PM: Registered nosave memory: 0000000000100000 - 0000000002000000 PM: Registered nosave memory: 0000000009f5b000 - 0000000009f5c000 PM: Registered nosave memory: 0000000009f60000 - 0000000009fff000 Allocating PCI resources starting at 40000000 (gap: 40000000:a0000000) Booting paravirtualized kernel on bare hardware NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:1 nr_node_ids:1 PERCPU: Embedded 14 pages/cpu @c2200000 s34456 r0 d22888 u2097152 pcpu-alloc: s34456 r0 d22888 u2097152 alloc=1*2097152 pcpu-alloc: [0] 0 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32447 Kernel command line: ro root=/dev/mapper/vg_dellpesc142001-lv_root rd_LVM_LV=vg_dellpesc142001/lv_swap LANG=en_US.UTF-8 console=ttyS0,115200 KEYTABLE=us rd_LVM_LV=vg_dellpesc142001/lv_root rd_NO_MD rd_NO_LUKS rd_NO_DM SYSFONT=latarcyrheb-sun16 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory initcall_debug ignore_loglevel memmap=exactmap memmap=4K$0K memmap=636K@4K memmap=64K$960K memmap=130412K@32768K memmap=19K@163181K memmap=4K@163836K memmap=8K#1047091K memmap=8K#1047099K memmap=1469K$1047107K memmap=262144K$3670016K memmap=1025K$4173824K memmap=512K$4174976K memmap=1024K$4175872K memmap=5120K$4189184K elfcorehdr=163180K Misrouted IRQ fixup and polling support enabled This may significantly impact system performance Disabling memory control group subsystem PID hash table entries: 512 (order: -1, 2048 bytes) Dentry cache hash table entries: 16384 (order: 4, 65536 bytes) Inode-cache hash table entries: 8192 (order: 3, 32768 bytes) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 Initializing HighMem for node 0 (00000000:00000000) Memory: 113988k/163840k available (4309k kernel code, 16936k reserved, 2416k data, 504k init, 0k highmem) virtual kernel memory layout: fixmap : 0xffad5000 - 0xfffff000 (5288 kB) pkmap : 0xff600000 - 0xff800000 (2048 kB) vmalloc : 0xca800000 - 0xff5fe000 ( 845 MB) lowmem : 0xc0000000 - 0xca000000 ( 160 MB) .init : 0xc2a92000 - 0xc2b10000 ( 504 kB) .data : 0xc2835673 - 0xc2a91708 (2416 kB) .text : 0xc2400000 - 0xc2835673 (4309 kB) Checking if this processor honours the WP bit even in supervisor mode...Ok. Hierarchical RCU implementation. NR_IRQS:2304 nr_irqs:256 Spurious LAPIC timer interrupt on cpu 0 Console: colour VGA+ 80x25 console [ttyS0] enabled hpet clockevent registered HPET: 3 timers in total, 0 timers will be used for per-cpu timer Fast TSC calibration using PIT Detected 3192.196 MHz processor. Calibrating delay loop (skipped), value calculated using timer frequency.. 6384.39 BogoMIPS (lpj=3192196) pid_max: default: 32768 minimum: 301 Security Framework initialized SELinux: Initializing. SELinux: Starting in permissive mode Mount-cache hash table entries: 512 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys blkio Initializing cgroup subsys perf_event CPU: Unsupported number of siblings 2 mce: CPU supports 4 MCE banks CPU0: Thermal LVT vector (0xfa) already installed using mwait in idle threads. Checking 'hlt' instruction... OK. SMP alternatives: switching to UP code Freeing SMP alternatives: 16k freed ACPI: Core revision 20090903 <==============================Hang here.
Since RHEL 6.3 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Hi Chao Ye, I find that kdump fails to reboot on dell-pesc1420 not only when saving the vmcore fails. Even with the default /etc/kdump.conf (file left unchanged), the machine still sometimes hangs at "ACPI: Core revision". Thanks, Chao Fan
Hi, The CPU in this machine has 4 cores. In my tests, kdump works only when the crash happens on cpu #0. If the crash happens on cpu #1, cpu #2, or cpu #3, the machine hangs. So there may be a problem with CPU initialization that causes this bug. Thanks, Chao Fan
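For anyone who wants to check the per-CPU behaviour described above, here is a minimal sketch of one way to trigger the sysrq crash from a chosen CPU. It is not the test script used in this report. The function name `crash_on_cpu` and the `DRY_RUN` variable are made up for illustration. It assumes `taskset` from util-linux is installed and that kdump is already configured. By default it only prints the command, because a real run panics the machine immediately.

```shell
#!/bin/sh
# Hypothetical helper: crash the kernel from a specific CPU so that the
# kdump boot can be compared for cpu 0 versus the other CPUs.
# DRY_RUN=1 (the default) only prints the command that would be run.
# Set DRY_RUN=0 and run as root to really crash the machine.
crash_on_cpu() {
    cpu="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "taskset -c $cpu sh -c 'echo c > /proc/sysrq-trigger'"
    else
        # Pin a shell to the chosen CPU; the sysrq handler then runs in
        # the context of that shell's write, i.e. on the same CPU.
        taskset -c "$cpu" sh -c 'echo c > /proc/sysrq-trigger'
    fi
}
```

For example, `DRY_RUN=0 crash_on_cpu 1` run as root would crash from cpu #1. The "CPU:" field in the resulting Oops output should show which CPU actually took the crash.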
This looks like the same bug as 964279, so let's track the issue there. *** This bug has been marked as a duplicate of bug 964279 ***