Description of problem:

The capture kernel failed to capture a vmcore on an IBM eServer x3455. The machine reset to the BIOS after the following messages:

...
powernow-k8: Pre-initialization of ACPI failed
powernow-k8: Found 1 Dual-Core AMD Opteron(tm) Processor 2220 SE processors (1 cpu cores) (version 2.20.00)
powernow-k8: BIOS error - no PSB or ACPI _PSS objects
ACPI: (supports S0 S4 S5)
Freeing unused kernel memory: 196k freed
Write protecting the kernel read-only data: 475k
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Loading scsi_mod.ko module
SCSI subsystem initialized
Loading sd_mod.ko module
Loading libata.ko module
Loading sata_svw.ko module
ACPI: PCI Interrupt Link [LNKS] enabled at IRQ 10
ACPI: PCI Interrupt 0000:01:0e.0[A] -> Link [LNKS] -> GSI 10 (level, low) -> IRQ 10
scsi0 : sata_svw
scsi1 : sata_svw
scsi2 : sata_svw
scsi3 : sata_svw
ata1: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100000 irq 10
ata2: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100100 irq 10
ata3: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100200 irq 10
ata4: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100300 irq 10
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

I have tried adding "noacpi" to the kdump kernel command line, and I have tried the RHEL5U1 version of kexec-tools, both without luck.

Note that the problem is only triggered by certain crash scenarios, for example the BUG that LKDTM (Linux Kernel Dump Test Module) injects in do_irq(). A simple "echo c >/proc/sysrq-trigger" works perfectly fine.

I have not tried this with i386 or the RHEL5U1 kernel yet.

Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080326.0
kernel-2.6.18-87.el5
kexec-tools-1.102pre-16.el5

How reproducible:
Always on ibm-pizzaro.rhts.boston.redhat.com

Steps to Reproduce:
1. Configure kdump and boot the kernel with crashkernel=128M@16M.
2. wget http://porkchop.devel.redhat.com/qa/rhts/lookaside/ltp-kdump-20080228.tar.gz; cd kdump/lib/lkdtm; export USE_SYMBOL_NAME=1; make
3. insmod lkdtm.ko cpoint_name=INT_HARDWARE_ENTRY cpoint_type=BUG cpoint_count=05
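
Before running step 3 it can be worth confirming that the crashkernel= reservation from step 1 actually reached the kernel command line. The following is only a minimal sketch: parse_crashkernel is a hypothetical helper (not part of kexec-tools or LKDTM), and the sample command line passed to it is made up for illustration. On the test machine the input would come from /proc/cmdline instead.

```shell
# Hypothetical helper (not part of any tool): print "SIZE OFFSET" for the
# first crashkernel=SIZE@OFFSET token found in a kernel command line string.
parse_crashkernel() {
    cmdline=$1
    # Rely on word splitting to walk the space-separated boot parameters.
    for tok in $cmdline; do
        case $tok in
            crashkernel=*@*)
                val=${tok#crashkernel=}          # e.g. 128M@16M
                echo "${val%@*} ${val#*@}"       # size, then offset
                return 0
                ;;
        esac
    done
    return 1    # no crashkernel=SIZE@OFFSET parameter present
}

# Illustrative input only; on a live system use:
#   parse_crashkernel "$(cat /proc/cmdline)"
parse_crashkernel "ro quiet crashkernel=128M@16M"
```

If the helper returns non-zero, no crashkernel=SIZE@OFFSET parameter was passed, so the capture kernel would have no reserved memory to load into.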
Created attachment 300229: sosreport
Created attachment 300234: Full serial console log
Neil suggested the following patch might be helpful here. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4bfaaef01a1badb9e8ffb0c0a37cd2379008d21f
Same thing happened for i386 as well.
I have tried adding the following options to the kdump kernel command line, and neither set helps:
"hda=noprobe hdb=noprobe hdc=noprobe hdd=noprobe"
"ide0=noprobe ide1=noprobe ide2=noprobe ide3=noprobe"
So I suppose I'll wait until this machine gets its BIOS updated first.
I'll close this out, as using jprobe() to trigger artificial crashes is probably not a good way to test kdump. I'll create a new kernel module to test those scenarios and open new BZs for any issues found.