Bug 438677 - [5.2][kdump][xen] capture kernel general protection fault in Dom0
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All Linux
Priority: low Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Xen Maintainance List
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-03-24 08:33 EDT by CAI Qian
Modified: 2008-10-22 08:04 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-10-22 08:04:57 EDT
Attachments
sosreport (2.25 MB, application/octet-stream)
2008-03-24 08:33 EDT, CAI Qian

Description CAI Qian 2008-03-24 08:33:00 EDT
Description of problem:
When a dump is triggered in Dom0, the capture kernel panics with a general
protection fault. The problem does not occur with the normal (non-Xen) kernel.

Red Hat Enterprise Linux Server release 5.2 Beta (Tikanga)
Kernel 2.6.18-86.el5xen on an x86_64

ibm-x3200m2-01.rhts.boston.redhat.com login: PCI-DMA: Out of SW-IOMMU space for
65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 45056 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 36864 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 36864 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
printk: 11 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
printk: 29 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 28672 bytes at device 0000:08:00.0
printk: 4 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 49152 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 32768 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 49152 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 49152 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
printk: 179 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 8192 bytes at device 0000:08:00.0
loaded crasher module
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at /root/kdump/lib/crasher/crasher.c:63
invalid opcode: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:1c.1/0000:03:00.0/irq
CPU 0 
Modules linked in: crasher(U) autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6
xfrm_nalgo crypto_api cpufreq_ondemand dm_multipath video sbs backlight i2c_ec
button battery asus_acpi ac lp joydev sg parport_pc serial_core parport ide_cd
i2c_i801 tg3 shpchp pcspkr i2c_core serio_raw cdrom dm_snapshot dm_zero
dm_mirror dm_mod ata_piix libata mptsas mptscsih mptbase scsi_transport_sas
sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2711, comm: bash Tainted: G      2.6.18-86.el5xen #1
RIP: e030:[<ffffffff884190bc>]  [<ffffffff884190bc>]
:crasher:crasher_write+0x51/0x92
RSP: e02b:ffff88000e021ed8  EFLAGS: 00010246
RAX: 0000000000000031 RBX: ffff88000f550c80 RCX: 0000000000000000
RDX: 00000000fffffff2 RSI: 00002aaaaea03001 RDI: ffff88000e021ee8
RBP: 0000000000000002 R08: ffffffff8841906b R09: 0000000000000001
R10: ffff88000e021e68 R11: 0000000000000000 R12: 00002aaaaea03000
R13: ffff88000e021f50 R14: 0000000000000000 R15: 0000000000000000
FS:  00002aaaab422e10(0000) GS:ffffffff805af000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process bash (pid: 2711, threadinfo ffff88000e020000, task ffff88000ff07100)
Stack:  ffff880000000001  3100000000000000  0000000000000000  ffff88000f550c80 
 0000000000000002  ffffffff802f207a  00002aaaaea03000  ffffffff80216b87 
 c000003e00000001  ffff88000f550c80 
Call Trace:
 [<ffffffff802f207a>] proc_file_write+0x27/0x2b
 [<ffffffff80216b87>] vfs_write+0xce/0x174
 [<ffffffff802173bf>] sys_write+0x45/0x6e
 [<ffffffff802602f1>] tracesys+0xa7/0xb2


Code: 0f 0b 68 5d 91 41 88 c2 3f 00 eb 2a c6 04 25 01 00 00 00 41 
RIP  [<ffffffff884190bc>] :crasher:crasher_write+0x51/0x92
 RSP <ffff88000e021ed8>
Linux version 2.6.18-86.el5 (brewbuilder@hs20-bc2-4.build.redhat.com) (gcc
version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Tue Mar 18 18:19:59 EDT 2008
Command line: ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 irqpoll
maxcpus=1 reset_devices memmap=exactmap memmap=640K@0K memmap=5116K@32768K
memmap=125300K@38524K elfcorehdr=163824K memmap=72K#521664K memmap=12K#521736K
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 000000000009cc00 (usable)
 BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000ce000 - 00000000000d4000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fd70000 (usable)
 BIOS-e820: 000000001fd70000 - 000000001fd82000 (ACPI data)
 BIOS-e820: 000000001fd82000 - 000000001fd85000 (ACPI NVS)
 BIOS-e820: 000000001fd85000 - 0000000020000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 0000000002000000 - 00000000024ff000 (usable)
 user: 000000000259f000 - 0000000009ffc000 (usable)
 user: 000000001fd70000 - 000000001fd82000 (ACPI data)
 user: 000000001fd82000 - 000000001fd85000 (ACPI data)
DMI present.
No NUMA configuration found
Faking a node at 0000000000000000-0000000009ffc000
Bootmem setup node 0 0000000000000000-0000000009ffc000
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
ACPI: PM-Timer IO Port: 0x1008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:15 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 6:15 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Setting APIC routing to physical flat
ACPI: HPET id: 0xffffffff base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Nosave address range: 00000000000a0000 - 0000000002000000
Nosave address range: 00000000024ff000 - 000000000259f000
Allocating PCI resources starting at 20000000 (gap: 1fd85000:e027b000)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
Built 1 zonelists.  Total pages: 32195
Kernel command line: ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200
irqpoll maxcpus=1 reset_devices memmap=exactmap memmap=640K@0K
memmap=5116K@32768K memmap=125300K@38524K elfcorehdr=163824K memmap=72K#521664K
memmap=12K#521736K
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
Initializing CPU#0
PID hash table entries: 512 (order: 9, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
Checking aperture...
Memory: 118832k/163824k available (2458k kernel code, 12224k reserved, 1244k
data, 196k init)
Calibrating delay using timer specific routine.. 6011.27 BogoMIPS (lpj=3005637)
general protection fault: 0000 [1] SMP 
last sysfs file: 
CPU 0 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.18-86.el5 #1
RIP: 0010:[<ffffffff8001a7ef>]  [<ffffffff8001a7ef>] alloc_arraycache+0x2a/0x56
RSP: 0018:ffffffff803cf658  EFLAGS: 00010206
RAX: 0608942c12870000 RBX: 0000000000000078 RCX: 0000000000000000
RDX: 0608942c12870000 RSI: 0000000000000001 RDI: ffff8100029983d8
RBP: 000000000000003c R08: 0000000000000000 R09: ffffffff802e3ae0
R10: ffffffff803cf6b0 R11: 00000000000007f8 R12: ffff81000299a200
R13: 000000000000003c R14: 0000000000000078 R15: 0000000000000008
FS:  0000000000000000(0000) GS:ffffffff8039e000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaaad6c8d CR3: 0000000002001000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffffffff803ce000, task ffffffff802e3ae0)
Stack:  0000000000000000 ffff8100029951c0 ffff810002996e00 0000000000000000
 0000000000000000 ffffffff800d438e ffffffff803cf708 0000000000000060
 0000000000000000 0000000000000980 ffff8100029951c0 0000000000000000
Call Trace:
 [<ffffffff800d438e>] do_tune_cpucache+0x51/0x3f5
 [<ffffffff8005be04>] cache_alloc_refill+0x106/0x186
 [<ffffffff800d48db>] enable_cpucache+0x4f/0x7b
 [<ffffffff800390c7>] kmem_cache_create+0x3aa/0x5a6
 [<ffffffff800adfcf>] cpuset_mems_allowed+0x45/0x4d
 [<ffffffff803eb099>] pidmap_init+0x42/0x4b
 [<ffffffff803d978d>] start_kernel+0x1a7/0x225
 [<ffffffff803d9237>] _sinittext+0x237/0x23e


Code: c7 00 00 00 00 00 89 58 04 89 68 08 c7 40 0c 00 00 00 00 c7 
RIP  [<ffffffff8001a7ef>] alloc_arraycache+0x2a/0x56
 RSP <ffffffff803cf658>
 <0>Kernel panic - not syncing: Fatal exception
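
One reading of the capture-kernel oops above, offered as an observation rather than a confirmed diagnosis: assuming the "Code:" bytes start at RIP, the leading bytes c7 00 00 00 00 00 decode to a 32-bit store of zero through RAX, and RAX holds 0608942c12870000. On x86_64 with 48-bit virtual addresses, a pointer whose upper bits are not a sign extension of bit 47 is non-canonical, and dereferencing it raises a general protection fault instead of a page fault. The Python sketch below only checks that property for a few register values taken from the oops; the helper name is_canonical is made up for illustration.

```python
def is_canonical(addr, va_bits=48):
    """Return True if addr is a canonical x86_64 virtual address.

    Canonical means bits 63..(va_bits - 1) are all zeros or all ones,
    i.e. the address is a sign extension of its low va_bits bits.
    """
    top = addr >> (va_bits - 1)                # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


# Register values copied from the capture-kernel oops.
REGISTERS = {
    "RAX": 0x0608942C12870000,
    "RDI": 0xFFFF8100029983D8,
    "R12": 0xFFFF81000299A200,
    "CR2": 0x00002AAAAAAD6C8D,
}

for name, value in REGISTERS.items():
    state = "canonical" if is_canonical(value) else "NON-canonical"
    print(f"{name} = {value:#018x}  {state}")
```

If this reading is right, the pointer stored to in alloc_arraycache was already corrupt when the slab allocator handed it back, which would point at memory handed to the capture kernel rather than at alloc_arraycache itself.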


Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080320.0 (x86_64)
kernel-xen-2.6.18-86.el5xen
kernel-2.6.18-86.el5
kexec-tools-1.102pre-15.el5

How reproducible:
Always on ibm-x3200m2-01.rhts.boston.redhat.com.

Steps to Reproduce:
- Configure kdump on kernel-xen with crashkernel=128M@32M
- Trigger a crash with sysrq-c
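
The crashkernel=128M@32M reservation can be cross-checked against the command line the capture kernel printed in the log above. The Python sketch below converts each memmap= entry to a byte range. The helper names are made up for illustration, and it assumes the usual kernel-parameter meaning of nn@ss (usable RAM) and nn#ss (ACPI data). The resulting ranges should line up with the "user-defined physical RAM map" in the log.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '5116K' or '32M' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit]


def parse_memmap(cmdline):
    """Return (start, end, kind) for every memmap=size@start / size#start."""
    regions = []
    pattern = r"memmap=(\d+[KMG]?)([@#])(\d+[KMG]?)"
    for size, sep, start in re.findall(pattern, cmdline):
        begin = parse_size(start)
        kind = "usable" if sep == "@" else "ACPI data"
        regions.append((begin, begin + parse_size(size), kind))
    return regions


# Capture-kernel command line, copied from the console log.
CMDLINE = (
    "ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 irqpoll "
    "maxcpus=1 reset_devices memmap=exactmap memmap=640K@0K "
    "memmap=5116K@32768K memmap=125300K@38524K elfcorehdr=163824K "
    "memmap=72K#521664K memmap=12K#521736K"
)

REGIONS = parse_memmap(CMDLINE)
for begin, end, kind in REGIONS:
    print(f"{begin:016x} - {end:016x} ({kind})")

# The crashkernel=128M@32M window, for comparison.
CRASH_START = parse_size("32M")
CRASH_END = CRASH_START + parse_size("128M")
ELFCOREHDR = parse_size("163824K")
```

Under these assumptions, every usable range above 640K falls inside the reserved 32M to 160M window, and elfcorehdr=163824K (0x9ffc000) sits 16K below the end of that window.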
Comment 1 CAI Qian 2008-03-24 08:33:00 EDT
Created attachment 298885
sosreport
Comment 2 CAI Qian 2008-03-24 08:38:23 EDT
Sorry, sysrq-c does not reproduce this bug. It has to be triggered as follows:

wget http://porkchop.devel.redhat.com/qa/rhts/lookaside/ltp-kdump-20080228.tar.gz
tar zxvf ltp-kdump-20080228.tar.gz
cd kdump/lib/crasher
make
insmod crasher.ko
echo 1 >/proc/crasher
Comment 3 CAI Qian 2008-10-22 08:04:57 EDT
I'll close this, as it has been fixed magically in the latest RHEL 5.3 Beta candidate tree.
