Bug 440399 - [5.2][kdump] capture kernel reset for IBM eServer x3455
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Ed Pollard
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-04-03 07:56 EDT by CAI Qian
Modified: 2013-08-05 20:04 EDT

Doc Type: Bug Fix
Last Closed: 2008-10-22 06:34:02 EDT

Attachments
sosreport (2.21 MB, application/octet-stream)
2008-04-03 07:56 EDT, CAI Qian
Full serial console log (41.01 KB, text/plain)
2008-04-03 07:59 EDT, CAI Qian

Description CAI Qian 2008-04-03 07:56:07 EDT
Description of problem:
The capture kernel failed to capture a vmcore on an IBM eServer x3455. The
machine reset to the BIOS after the following messages:

...
powernow-k8: Pre-initialization of ACPI failed
powernow-k8: Found 1 Dual-Core AMD Opteron(tm) Processor 2220 SE processors (1
cpu cores) (version 2.20.00)
powernow-k8: BIOS error - no PSB or ACPI _PSS objects
ACPI: (supports S0 S4 S5)
Freeing unused kernel memory: 196k freed
Write protecting the kernel read-only data: 475k
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Loading scsi_mod.ko module
SCSI subsystem initialized
Loading sd_mod.ko module
Loading libata.ko module
Loading sata_svw.ko module
ACPI: PCI Interrupt Link [LNKS] enabled at IRQ 10
ACPI: PCI Interrupt 0000:01:0e.0[A] -> Link [LNKS] -> GSI 10 (level, low) -> IRQ 10
scsi0 : sata_svw
scsi1 : sata_svw
scsi2 : sata_svw
scsi3 : sata_svw
ata1: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100000 irq 10
ata2: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100100 irq 10
ata3: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100200 irq 10
ata4: SATA max UDMA/133 mmio m8192@0xd8100000 port 0xd8100300 irq 10
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

I have tried adding "noacpi" to the kdump kernel command line, and the RHEL5U1
version of kexec-tools, without luck. Note that the problem is only triggered
by certain crash scenarios, for example, the BUG that LKDTM (Linux Kernel Dump
Test Module) injects in do_irq(). A simple "echo c >/proc/sysrq-trigger" works
perfectly fine. I have not tried this with i386 or the RHEL5U1 kernel yet.

Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080326.0
kernel-2.6.18-87.el5
kexec-tools-1.102pre-16.el5

How reproducible:
Always on ibm-pizzaro.rhts.boston.redhat.com

Steps to Reproduce:
1. Configure kdump and boot the kernel with crashkernel=128M@16M.
2. wget http://porkchop.devel.redhat.com/qa/rhts/lookaside/ltp-kdump-20080228.tar.gz;
cd kdump/lib/lkdtm; export USE_SYMBOL_NAME=1; make
3. insmod lkdtm.ko cpoint_name=INT_HARDWARE_ENTRY cpoint_type=BUG cpoint_count=05
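
For anyone repeating these steps, the following is a minimal dry-run sketch, not the script used for this report. It checks a kernel command line for a crashkernel= reservation and prints the insmod command from step 3 instead of running it, since running it crashes the machine. The helper names has_crashkernel and lkdtm_cmd are hypothetical, and the sample command line is an illustrative value only.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. Helper names are hypothetical.

# Succeed if the given kernel command line reserves crash memory.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the insmod command from step 3 instead of running it.
# $1 = crash point name, $2 = crash point type, $3 = hit count
lkdtm_cmd() {
    printf 'insmod lkdtm.ko cpoint_name=%s cpoint_type=%s cpoint_count=%s\n' "$1" "$2" "$3"
}

# Illustrative value; on a real system read this from /proc/cmdline.
cmdline="ro root=LABEL=/ crashkernel=128M@16M"

if has_crashkernel "$cmdline"; then
    lkdtm_cmd INT_HARDWARE_ENTRY BUG 05
else
    echo "crashkernel= not set; kdump would not capture a vmcore" >&2
fi
```

On the affected machine, the printed command would be run from kdump/lib/lkdtm after the module has been built.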
Comment 1 CAI Qian 2008-04-03 07:56:08 EDT
Created attachment 300229 [details]
sosreport
Comment 2 CAI Qian 2008-04-03 07:59:01 EDT
Created attachment 300234 [details]
Full serial console log
Comment 3 CAI Qian 2008-04-03 08:00:34 EDT
Neil suggested the following patch might be helpful here.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4bfaaef01a1badb9e8ffb0c0a37cd2379008d21f
Comment 4 CAI Qian 2008-04-03 11:15:26 EDT
The same thing happens on i386 as well.
Comment 5 CAI Qian 2008-07-16 06:09:34 EDT
I have tried adding the following to the kdump kernel options, and it does not help.

"hda=noprobe hdb=noprobe hdc=noprobe hdd=noprobe"
"ide0=noprobe ide1=noprobe ide2=noprobe ide3=noprobe"

So I suppose I'll wait until this machine gets a BIOS update first.
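
As a rough illustration of how such options can be passed to the kdump kernel, here is a dry-run sketch. It assumes they are appended through KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump, which this report does not state. The helper append_kdump_opts is hypothetical, and the existing value shown is only an example.

```shell
#!/bin/sh
# Sketch: build a KDUMP_COMMANDLINE_APPEND line with extra options (dry run).
# append_kdump_opts is a hypothetical helper; it only prints the resulting line.

# $1 = current KDUMP_COMMANDLINE_APPEND value, $2 = options to add
append_kdump_opts() {
    printf 'KDUMP_COMMANDLINE_APPEND="%s %s"\n' "$1" "$2"
}

current="irqpoll maxcpus=1"   # assumed existing value, for illustration only

append_kdump_opts "$current" "hda=noprobe hdb=noprobe hdc=noprobe hdd=noprobe"

# The printed line would replace the existing one in /etc/sysconfig/kdump,
# followed by "service kdump restart" so the capture kernel is reloaded.
```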
Comment 6 CAI Qian 2008-10-22 06:34:02 EDT
I'll close this out, as using jprobe() to trigger artificial crashes is probably not a good way to test Kdump. I'll create a new kernel module to test those scenarios and open new BZs for any issues found.
