Bug 438677

Summary: [5.2][kdump][xen] capture kernel general protection fault in Dom0
Product: Red Hat Enterprise Linux 5 Reporter: Qian Cai <qcai>
Component: kernel Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Martin Jenner <mjenner>
Severity: low Docs Contact:
Priority: low    
Version: 5.2   
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-10-22 12:04:57 UTC Type: ---
Attachments:
Description Flags
sosreport none

Description Qian Cai 2008-03-24 12:33:00 UTC
Description of problem:
When a dump is triggered in Dom0, the capture kernel panics with a general
protection fault. The problem does not occur with the normal (non-Xen) kernel.

Red Hat Enterprise Linux Server release 5.2 Beta (Tikanga)
Kernel 2.6.18-86.el5xen on an x86_64

ibm-x3200m2-01.rhts.boston.redhat.com login: PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 45056 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 36864 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 36864 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
printk: 11 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
printk: 29 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 28672 bytes at device 0000:08:00.0
printk: 4 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 49152 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 32768 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 49152 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 49152 bytes at device 0000:08:00.0
PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:00.0
printk: 179 messages suppressed.
PCI-DMA: Out of SW-IOMMU space for 8192 bytes at device 0000:08:00.0
loaded crasher module
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at /root/kdump/lib/crasher/crasher.c:63
invalid opcode: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:1c.1/0000:03:00.0/irq
CPU 0 
Modules linked in: crasher(U) autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6
xfrm_nalgo crypto_api cpufreq_ondemand dm_multipath video sbs backlight i2c_ec
button battery asus_acpi ac lp joydev sg parport_pc serial_core parport ide_cd
i2c_i801 tg3 shpchp pcspkr i2c_core serio_raw cdrom dm_snapshot dm_zero
dm_mirror dm_mod ata_piix libata mptsas mptscsih mptbase scsi_transport_sas
sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2711, comm: bash Tainted: G      2.6.18-86.el5xen #1
RIP: e030:[<ffffffff884190bc>]  [<ffffffff884190bc>] :crasher:crasher_write+0x51/0x92
RSP: e02b:ffff88000e021ed8  EFLAGS: 00010246
RAX: 0000000000000031 RBX: ffff88000f550c80 RCX: 0000000000000000
RDX: 00000000fffffff2 RSI: 00002aaaaea03001 RDI: ffff88000e021ee8
RBP: 0000000000000002 R08: ffffffff8841906b R09: 0000000000000001
R10: ffff88000e021e68 R11: 0000000000000000 R12: 00002aaaaea03000
R13: ffff88000e021f50 R14: 0000000000000000 R15: 0000000000000000
FS:  00002aaaab422e10(0000) GS:ffffffff805af000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process bash (pid: 2711, threadinfo ffff88000e020000, task ffff88000ff07100)
Stack:  ffff880000000001  3100000000000000  0000000000000000  ffff88000f550c80 
 0000000000000002  ffffffff802f207a  00002aaaaea03000  ffffffff80216b87 
 c000003e00000001  ffff88000f550c80 
Call Trace:
 [<ffffffff802f207a>] proc_file_write+0x27/0x2b
 [<ffffffff80216b87>] vfs_write+0xce/0x174
 [<ffffffff802173bf>] sys_write+0x45/0x6e
 [<ffffffff802602f1>] tracesys+0xa7/0xb2


Code: 0f 0b 68 5d 91 41 88 c2 3f 00 eb 2a c6 04 25 01 00 00 00 41 
RIP  [<ffffffff884190bc>] :crasher:crasher_write+0x51/0x92
 RSP <ffff88000e021ed8>
Linux version 2.6.18-86.el5 (brewbuilder.redhat.com) (gcc
version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Tue Mar 18 18:19:59 EDT 2008
Command line: ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 irqpoll
maxcpus=1 reset_devices memmap=exactmap memmap=640K@0K memmap=5116K@32768K
memmap=125300K@38524K elfcorehdr=163824K memmap=72K#521664K memmap=12K#521736K
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 000000000009cc00 (usable)
 BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000ce000 - 00000000000d4000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fd70000 (usable)
 BIOS-e820: 000000001fd70000 - 000000001fd82000 (ACPI data)
 BIOS-e820: 000000001fd82000 - 000000001fd85000 (ACPI NVS)
 BIOS-e820: 000000001fd85000 - 0000000020000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 0000000002000000 - 00000000024ff000 (usable)
 user: 000000000259f000 - 0000000009ffc000 (usable)
 user: 000000001fd70000 - 000000001fd82000 (ACPI data)
 user: 000000001fd82000 - 000000001fd85000 (ACPI data)
DMI present.
No NUMA configuration found
Faking a node at 0000000000000000-0000000009ffc000
Bootmem setup node 0 0000000000000000-0000000009ffc000
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
ACPI: PM-Timer IO Port: 0x1008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:15 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 6:15 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Setting APIC routing to physical flat
ACPI: HPET id: 0xffffffff base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Nosave address range: 00000000000a0000 - 0000000002000000
Nosave address range: 00000000024ff000 - 000000000259f000
Allocating PCI resources starting at 20000000 (gap: 1fd85000:e027b000)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
Built 1 zonelists.  Total pages: 32195
Kernel command line: ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200
irqpoll maxcpus=1 reset_devices memmap=exactmap memmap=640K@0K
memmap=5116K@32768K memmap=125300K@38524K elfcorehdr=163824K memmap=72K#521664K
memmap=12K#521736K
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
Initializing CPU#0
PID hash table entries: 512 (order: 9, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
Checking aperture...
Memory: 118832k/163824k available (2458k kernel code, 12224k reserved, 1244k
data, 196k init)
Calibrating delay using timer specific routine.. 6011.27 BogoMIPS (lpj=3005637)
general protection fault: 0000 [1] SMP 
last sysfs file: 
CPU 0 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.18-86.el5 #1
RIP: 0010:[<ffffffff8001a7ef>]  [<ffffffff8001a7ef>] alloc_arraycache+0x2a/0x56
RSP: 0018:ffffffff803cf658  EFLAGS: 00010206
RAX: 0608942c12870000 RBX: 0000000000000078 RCX: 0000000000000000
RDX: 0608942c12870000 RSI: 0000000000000001 RDI: ffff8100029983d8
RBP: 000000000000003c R08: 0000000000000000 R09: ffffffff802e3ae0
R10: ffffffff803cf6b0 R11: 00000000000007f8 R12: ffff81000299a200
R13: 000000000000003c R14: 0000000000000078 R15: 0000000000000008
FS:  0000000000000000(0000) GS:ffffffff8039e000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaaad6c8d CR3: 0000000002001000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffffffff803ce000, task ffffffff802e3ae0)
Stack:  0000000000000000 ffff8100029951c0 ffff810002996e00 0000000000000000
 0000000000000000 ffffffff800d438e ffffffff803cf708 0000000000000060
 0000000000000000 0000000000000980 ffff8100029951c0 0000000000000000
Call Trace:
 [<ffffffff800d438e>] do_tune_cpucache+0x51/0x3f5
 [<ffffffff8005be04>] cache_alloc_refill+0x106/0x186
 [<ffffffff800d48db>] enable_cpucache+0x4f/0x7b
 [<ffffffff800390c7>] kmem_cache_create+0x3aa/0x5a6
 [<ffffffff800adfcf>] cpuset_mems_allowed+0x45/0x4d
 [<ffffffff803eb099>] pidmap_init+0x42/0x4b
 [<ffffffff803d978d>] start_kernel+0x1a7/0x225
 [<ffffffff803d9237>] _sinittext+0x237/0x23e


Code: c7 00 00 00 00 00 89 58 04 89 68 08 c7 40 0c 00 00 00 00 c7 
RIP  [<ffffffff8001a7ef>] alloc_arraycache+0x2a/0x56
 RSP <ffffffff803cf658>
 <0>Kernel panic - not syncing: Fatal exception
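
Note on the log above: the "user-defined physical RAM map" printed by the capture kernel should follow directly from the memmap= arguments on its command line. The sketch below is an illustrative helper, not part of the original report; the function names to_bytes and decode_memmap are made up for this example. It assumes the usual meaning of the two forms seen here, SIZE@START for usable memory and SIZE#START for ACPI data, and prints each range in the same layout as the kernel's map so the two can be compared by eye.

```shell
#!/bin/sh
# Illustrative helper (hypothetical names, not from the original report).

# Convert a size token such as 640K, 32M or 4096 to bytes.
to_bytes() {
    case "$1" in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

# Print "start - end (type)" for every memmap=SIZE@START or
# memmap=SIZE#START argument in the given kernel command line.
# Other arguments, including memmap=exactmap, are skipped.
decode_memmap() {
    local arg sep kind spec size start s e
    for arg in $1; do
        case "$arg" in
            memmap=*@*)   sep='@'; kind='usable' ;;
            memmap=*'#'*) sep='#'; kind='ACPI data' ;;
            *) continue ;;
        esac
        spec=${arg#memmap=}
        size=${spec%%"$sep"*}      # text before the separator
        start=${spec#*"$sep"}      # text after the separator
        s=$(to_bytes "$start")
        e=$(( s + $(to_bytes "$size") ))
        printf '%016x - %016x (%s)\n' "$s" "$e" "$kind"
    done
}

# Command line copied from the capture kernel's boot log in this report.
decode_memmap "ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 irqpoll \
maxcpus=1 reset_devices memmap=exactmap memmap=640K@0K memmap=5116K@32768K \
memmap=125300K@38524K elfcorehdr=163824K memmap=72K#521664K memmap=12K#521736K"
```

The five ranges it prints can be checked line by line against the "user:" entries in the log; the last usable range ends at 0x9ffc000, which is also the elfcorehdr=163824K address.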


Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080320.0 (x86_64)
kernel-xen-2.6.18-86.el5xen
kernel-2.6.18-86.el5
kexec-tools-1.102pre-15.el5

How reproducible:
Always on ibm-x3200m2-01.rhts.boston.redhat.com.

Steps to Reproduce:
- Configure kdump on kernel-xen with crashkernel=128M@32M
- sysrq-c
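
For reference, the physical range reserved by crashkernel=128M@32M can be worked out with a few lines of shell. This is only a sketch added for illustration; crashkernel_range and to_bytes are hypothetical helper names, and the SIZE@OFFSET form with K/M/G suffixes is the only syntax it handles.

```shell
#!/bin/sh
# Illustrative helper (hypothetical names, not from the original report).

# Convert a size token such as 128M or 640K to bytes.
to_bytes() {
    case "$1" in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

# Print the start and end address of a crashkernel=SIZE@OFFSET reservation.
# Accepts the value with or without the leading "crashkernel=".
crashkernel_range() {
    local spec size offset s e
    spec=${1#crashkernel=}
    size=${spec%%@*}       # part before '@'
    offset=${spec#*@}      # part after '@'
    s=$(to_bytes "$offset")
    e=$(( s + $(to_bytes "$size") ))
    printf '0x%x - 0x%x\n' "$s" "$e"
}

crashkernel_range crashkernel=128M@32M
```

For this report's setting the reservation runs from 0x2000000 (32M) to 0xa000000 (160M), which is the window the capture kernel's usable memmap= ranges fall into.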

Comment 1 Qian Cai 2008-03-24 12:33:00 UTC
Created attachment 298885 [details]
sosreport

Comment 2 Qian Cai 2008-03-24 12:38:23 UTC
Sorry, sysrq-c does not reproduce this bug. It has to be triggered as follows:

wget http://porkchop.devel.redhat.com/qa/rhts/lookaside/ltp-kdump-20080228.tar.gz
tar zxvf ltp-kdump-20080228.tar.gz
cd kdump/lib/crasher
make
insmod crasher.ko
echo 1 >/proc/crasher

Comment 3 Qian Cai 2008-10-22 12:04:57 UTC
I'll close this, as it has been fixed magically in the latest RHEL 5.3 Beta candidate tree.