Bug 440229

Summary: [5.2][xen] Dom0 Kernel panic at BUG: warning at arch/i386/kernel/smp-xen.c:529/smp_call_function()
Product: Red Hat Enterprise Linux 5
Reporter: Qian Cai <qcai>
Component: kernel-xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED DUPLICATE
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: high    
Version: 5.2   
Target Milestone: rc   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-04-03 06:56:27 UTC

Description Qian Cai 2008-04-02 12:47:26 UTC
Description of problem:
When booting the dom0 kernel, it panics with output like the following:

Bringing up interface eth0:  BUG: unable to handle kernel paging request at
virtual address 014852a3
 printing eip:
2ccc2000 -> *pde = 00000000:124a0001
2d0a0000 -> *pme = 00000000:00000000
Oops: 0000 [#1]
SMP 
last sysfs file: /class/net/lo/type
Modules linked in: ipv6 xfrm_nalgo crypto_api cpufreq_ondemand dm_multipath
video sbs backlight i2c_ec button battery asus_acpi ac parport_pc lp parport
i2c_i801 i2c_core bnx2 ide_cd i5000_edac edac_mc cdrom sg pcspkr serial_core
dm_snapshot dm_zero dm_mirror dm_mod ata_piix libata aacraid megaraid_sas sd_mod
scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0061:[<c04e7556>]    Not tainted VLI
EFLAGS: 00010082   (2.6.18-87.el5xen #1) 
EIP is at __sync_single+0x1c/0x129
eax: 014852a3   ebx: ffffd0ae   ecx: 00000002   edx: 00000300
esi: 014852a3   edi: 00000002   ebp: 00000000   esp: c071bf20
ds: 007b   es: 007b   ss: 0069
Process readahead (pid: 1814, ti=c071b000 task=c0eaeaa0 task.ti=ec9dc000)
Stack: 00000002 bffff000 00000000 f57a0000 00000300 c1659980 ffffd0ae bffff000 
       00000002 00000000 c04e76e1 00000000 00000002 014852a3 00000300 00000000 
       00000000 00000000 ed6d14f0 c04e794f 00000002 0000001d c0df8848 00000000 
Call Trace:
 [<c04e76e1>] unmap_single+0x3b/0xb6
 [<c04e794f>] swiotlb_unmap_sg+0x10c/0x129
 [<ee069cee>] megasas_unmap_sgbuf+0x2f/0x73 [megaraid_sas]
 [<ee069ec8>] megasas_complete_cmd_dpc+0x16c/0x2e4 [megaraid_sas]
 [<c042613b>] tasklet_action+0x60/0xc2
 [<c0426076>] __do_softirq+0x5e/0xc3
 [<c0406edf>] do_softirq+0x56/0xaf
 [<c0406e80>] do_IRQ+0xa5/0xae
 [<c05495d3>] evtchn_do_upcall+0x64/0x9b
 [<c04055d9>] hypervisor_callback+0x3d/0x48
 [<c0548cec>] force_evtchn_callback+0xa/0xc
 [<c0454755>] __pagevec_lru_add+0x7b/0x93
 [<c048be74>] mpage_readpages+0xb5/0xf9
 [<c0452b2e>] __alloc_pages+0x57/0x297
 [<ee0ab9ce>] ext3_readpages+0x0/0x15 [ext3]
 [<c0453fb6>] __do_page_cache_readahead+0x127/0x1ce
 [<ee0ac52a>] ext3_get_block+0x0/0xbd [ext3]
 [<c0454394>] force_page_cache_readahead+0x50/0x68
 [<c044e1aa>] sys_readahead+0x7f/0x98
 [<c0405413>] syscall_call+0x7/0xb
 =======================
Code: c8 09 d0 5a 0f 94 c0 59 0f b6 c0 5b 5e 5f c3 55 57 56 89 c6 53 83 ec 18 89
4c 24 04 8b 4c 24 30 89 54 24 10 8b 6c 24 2c 89 0c 24 <8b> 08 c1 e9 1e 8b 0c 8d
50 55 6e c0 8b 99 0c 12 00 00 89 44 24 
EIP: [<c04e7556>] __sync_single+0x1c/0x129 SS:ESP 0069:c071bf20
 <0>Kernel panic - not syncing: Fatal exception in interrupt
 BUG: warning at arch/i386/kernel/smp-xen.c:529/smp_call_function() (Not tainted)
 [<c041262f>] smp_call_function+0x59/0xfe
 [<c04126e7>] smp_send_stop+0x13/0x1e
 [<c042147f>] panic+0x4c/0x171
 [<c0406098>] die+0x262/0x296
 [<c060a639>] do_page_fault+0xa7d/0xbf1
 [<c0470155>] end_bio_bh_io_sync+0x0/0x39
 [<c04716df>] bio_put+0x28/0x29
 [<c04693fd>] kmem_cache_free+0x4b/0x84
 [<c04693fd>] kmem_cache_free+0x4b/0x84
 [<c04693fd>] kmem_cache_free+0x4b/0x84
 [<c04e29f3>] vsnprintf+0x41f/0x45d
 [<c0609bbc>] do_page_fault+0x0/0xbf1
 [<c0405597>] error_code+0x2b/0x30
 [<c041007b>] powernowk8_cpu_init+0x381/0xbc2
 [<c04e7556>] __sync_single+0x1c/0x129
 [<c04e76e1>] unmap_single+0x3b/0xb6
 [<c04e794f>] swiotlb_unmap_sg+0x10c/0x129
 [<ee069cee>] megasas_unmap_sgbuf+0x2f/0x73 [megaraid_sas]
 [<ee069ec8>] megasas_complete_cmd_dpc+0x16c/0x2e4 [megaraid_sas]
 [<c042613b>] tasklet_action+0x60/0xc2
 [<c0426076>] __do_softirq+0x5e/0xc3
 [<c0406edf>] do_softirq+0x56/0xaf
 [<c0406e80>] do_IRQ+0xa5/0xae
 [<c05495d3>] evtchn_do_upcall+0x64/0x9b
 [<c04055d9>] hypervisor_callback+0x3d/0x48
 [<c0548cec>] force_evtchn_callback+0xa/0xc
 [<c0454755>] __pagevec_lru_add+0x7b/0x93
 [<c048be74>] mpage_readpages+0xb5/0xf9
 [<c0452b2e>] __alloc_pages+0x57/0x297
 [<ee0ab9ce>] ext3_readpages+0x0/0x15 [ext3]
 [<c0453fb6>] __do_page_cache_readahead+0x127/0x1ce
 [<ee0ac52a>] ext3_get_block+0x0/0xbd [ext3]
 [<c0454394>] force_page_cache_readahead+0x50/0x68
 [<c044e1aa>] sys_readahead+0x7f/0x98
 [<c0405413>] syscall_call+0x7/0xb
 =======================
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
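
The two call traces above share the same frames from unmap_single down to syscall_call. A minimal sketch for pulling the address, symbol, offset, and module out of such trace lines, so traces from different runs can be compared, might look like this. The names FRAME_RE and parse_frames are hypothetical helpers for illustration and are not part of this report or of any kernel tooling; the sample text is copied from the trace above.

```python
import re

# Matches frames such as " [<c04e76e1>] unmap_single+0x3b/0xb6 [module]".
# The trailing "[module]" part is optional and must be on the same line.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:[ \t]+\[(\w+)\])?"
)

def parse_frames(text):
    """Return (address, symbol, offset, size, module) tuples found in text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        addr, sym, off, size, mod = m.groups()
        frames.append((int(addr, 16), sym, int(off, 16), int(size, 16), mod))
    return frames

# Three frames copied from the call trace in this report.
sample = """\
 [<c04e76e1>] unmap_single+0x3b/0xb6
 [<c04e794f>] swiotlb_unmap_sg+0x10c/0x129
 [<ee069cee>] megasas_unmap_sgbuf+0x2f/0x73 [megaraid_sas]
"""

for addr, sym, off, size, mod in parse_frames(sample):
    # Frames without a module tag belong to the core kernel image.
    print(f"{addr:08x} {sym} +{off:#x}/{size:#x} {mod or 'kernel'}")
```

Comparing the symbol columns of two parsed traces is usually enough to tell whether two oopses took the same path.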

Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080326.0
kernel-xen-2.6.18-87.el5

How reproducible:
So far I have reproduced it on these two machines:
dell-pe1850-01.rhts.boston.redhat.com
ibm-defiant.rhts.boston.redhat.com

RHTS jobs:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=2499786
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=2500284

Additional info:
Not sure if it is related, but the kernel-xen boot parameters were:

kernel /xen.gz-2.6.18-87.el5 crashkernel=128M@32M nmi_watchdog=1 com1=115200,8n1
module /vmlinuz-2.6.18-87.el5xen ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200

Comment 1 Bill Burns 2008-04-02 13:17:58 UTC
Can you try the kernel provided in comment #27 of bug 433554?
Thanks.

Comment 2 Qian Cai 2008-04-03 06:56:27 UTC

*** This bug has been marked as a duplicate of 433554 ***

Comment 3 Qian Cai 2008-04-03 08:17:42 UTC
Tested, and it fixed the problem here.

Comment 4 Bill Burns 2008-04-03 11:56:56 UTC
Awesome, thanks!