Bug 442670 - [5.2][kdump][xen] kdump on Dom0 Kernel not work properly on ibm-x3200m2-01
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Assigned To: Xen Maintainance List
QA Contact: Martin Jenner
Reported: 2008-04-16 03:06 EDT by CAI Qian
Modified: 2008-10-22 08:32 EDT
Doc Type: Bug Fix
Last Closed: 2008-10-22 08:32:58 EDT
Description CAI Qian 2008-04-16 03:06:43 EDT
Description of problem:
Kdump on the Dom0 kernel does not work properly on
ibm-x3200m2-01.rhts.boston.redhat.com. Each time the kdump service is started
on this box, the following suspicious messages appear:

printk: 27263 messages suppressed.
4gb seg fixup, process ldd (pid 5538), cs:ip 73:001d53dd
4gb seg fixup, process ldd (pid 5538), cs:ip 73:001d53dd
4gb seg fixup, process ldd (pid 5538), cs:ip 73:001d53dd
4gb seg fixup, process ldd (pid 5538), cs:ip 73:001d53dd

I have seen several random failures. The capture kernel can Oops:

mptbase: ioc0: Initiating bringup
ioc0: LSISAS1064E B1: Capabilities={Initiator}
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
*pde = 02302001
Oops: 0002 [#1]
last sysfs file: 
Modules linked in: mptsas scsi_transport_sas mptscsih sd_mod scsi_mod mptbase
CPU:    0
EIP:    0060:[<ca847022>]    Not tainted VLI
EFLAGS: 00010206   (2.6.18-89.el5PAE #1) 
EIP is at mpt_findImVolumes+0x4d3/0x525 [mptbase]
eax: 00000000   ebx: c9f11000   ecx: c227a800   edx: c227a88c
esi: c9f11040   edi: c9eec800   ebp: 09f11000   esp: c9e18b04
ds: 007b   es: 007b   ss: 0068
Process exe (pid: 430, ti=c9e18000 task=c9e19aa0 task.ti=c9e18000)
Stack: c9eec800 00000000 c9f12000 0000008c 00000282 c202ddce c9e18b44 ffffffff 
       00100100 00200200 00000000 00200200 fffbb1d4 ca848255 c9eec800 c239d080 
       c9e18c24 09f11000 00000000 00000001 00000000 c2000001 00000000 c9eec800 
Call Trace:
 [<c202ddce>] lock_timer_base+0x15/0x2f
 [<ca848255>] mpt_timer_expired+0x0/0x4e [mptbase]
 [<c202e1d8>] msleep+0x17/0x1c
 [<ca8436c9>] WaitForDoorbellInt+0x37/0x95 [mptbase]
 [<ca843a43>] mpt_handshake_req_reply_wait+0x298/0x3d0 [mptbase]
 [<ca8443c6>] SendIocInit+0x2ce/0x3ba [mptbase]
 [<ca848255>] mpt_timer_expired+0x0/0x4e [mptbase]
 [<ca8477fa>] mpt_do_ioc_recovery+0x786/0x107e [mptbase]
 [<c20e4f50>] __delay+0x6/0x7
 [<c2206f54>] schedule+0x920/0x9cd
 [<c21a3d24>] pci_read+0x1c/0x21
 [<c2021ab6>] __cond_resched+0x16/0x34
 [<c220702b>] cond_resched+0x2a/0x31
 [<c201789b>] smp_call_function+0x23/0xc3
 [<c206494e>] __get_vm_area_node+0xa6/0x165
 [<c2017a86>] do_flush_tlb_all+0x0/0x5a
 [<c202a551>] on_each_cpu+0x17/0x1f
 [<c21a2b47>] pci_conf1_read+0xa4/0xad
 [<c21a3d24>] pci_read+0x1c/0x21
 [<c2038c81>] down_read+0x8/0x11
 [<ca849e3e>] mpt_attach+0xa4e/0xb2e [mptbase]
 [<c214ccad>] __driver_attach+0x0/0x6b
 [<ca86fa52>] mptsas_probe+0x10/0x3fb [mptsas]
 [<c20eda28>] pci_match_device+0x10/0xac
 [<c214ccad>] __driver_attach+0x0/0x6b
 [<c20edb10>] pci_device_probe+0x36/0x57
 [<c214cc00>] driver_probe_device+0x42/0x92
 [<c214ccf1>] __driver_attach+0x44/0x6b
 [<c214c6fe>] bus_for_each_dev+0x37/0x59
 [<c214cb6a>] driver_attach+0x11/0x13
 [<c214ccad>] __driver_attach+0x0/0x6b
 [<c214c406>] bus_add_driver+0x64/0xfd
 [<c20edc35>] __pci_register_driver+0x3e/0x58
 [<ca83b0b5>] mptsas_init+0xb5/0xc9 [mptsas]
 [<c203e859>] sys_init_module+0x18b5/0x1a60
 [<c207ae01>] permission+0xa2/0xb5
 [<ca82af52>] sas_release_transport+0x0/0x47 [scsi_transport_sas]
 [<c200946a>] sys_mmap2+0x99/0xa3
 [<c2004eff>] syscall_call+0x7/0xb
Code: 94 24 21 01 00 00 ff b4 24 14 01 00 00 0f 45 c1 ff b4 24 14 01 00 00 89 d9
c1 e2 ff ff ff ff ff ff 00 07 e9 d8 17 98 08 06 00 01 <08> 00 06 04 00 01 00 07
e9 d8 17 98 c0 a8 4f 93 ff ff ff ff ff 
EIP: [<ca847022>] mpt_findImVolumes+0x4d3/0x525 [mptbase] SS:ESP 0068:c9e18b04
 <0>Kernel panic - not syncing: Fatal exception
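
When comparing several of these random failures, it can help to pull the faulting location out of each console log mechanically. The following is a minimal Python sketch, not part of the original report: the function name parse_eip_line and the regular expression are illustrative assumptions, modelled only on the "EIP is at ..." lines quoted in this bug.

```python
import re

# Matches lines such as (both taken from this report):
#   EIP is at mpt_findImVolumes+0x4d3/0x525 [mptbase]
#   EIP is at _spin_lock_bh+0x12/0x18
EIP_RE = re.compile(
    r"EIP is at (?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_eip_line(line):
    """Return the faulting symbol, offset, size and module, or None.

    parse_eip_line is a hypothetical helper for illustration only.
    """
    m = EIP_RE.search(line)
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),  # offset into the function
        "size": int(m.group("size"), 16),      # total size of the function
        "module": m.group("module"),           # None when no [module] tag is present
    }


if __name__ == "__main__":
    print(parse_eip_line("EIP is at mpt_findImVolumes+0x4d3/0x525 [mptbase]"))
```

For the Oops above, this reports the symbol mpt_findImVolumes in module mptbase, at offset 0x4d3 of a 0x525-byte function.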

The capture kernel can also hit a soft lockup:

BUG: soft lockup - CPU#0 stuck for 10s! [ifconfig:1151]

Pid: 1151, comm:             ifconfig
EIP: 0060:[<c2208701>] CPU: 0
EIP is at _spin_lock_bh+0x12/0x18
 EFLAGS: 00000286    Not tainted  (2.6.18-89.el5PAE #1)
EAX: c89be000 EBX: c9c1c860 ECX: 00000000 EDX: 00203100
ESI: 00000000 EDI: 00000218 EBP: 00000860 DS: 007b ES: 007b
CR0: 80050033 CR2: 08205698 CR3: 0231f9c0 CR4: 000006f0
 [<c21c453e>] rt_run_flush+0x47/0x8f
 [<c22f30c0>] powernow_cpu_init+0x2fd/0x568
 [<c21ebe62>] ip_mc_inc_group+0x168/0x194
 [<c22f30c0>] powernow_cpu_init+0x2fd/0x568
 [<c22f317c>] powernow_cpu_init+0x3b9/0x568
 [<c22f30c0>] powernow_cpu_init+0x2fd/0x568
 [<c21ebec7>] ip_mc_up+0x39/0x4e
 [<c22f3124>] powernow_cpu_init+0x361/0x568
 [<c21e7eac>] inetdev_init+0xe5/0x101
 [<c21e89a5>] devinet_ioctl+0x3a8/0x542
 [<c21a4dd7>] sock_ioctl+0x191/0x1b3
 [<c21a4c46>] sock_ioctl+0x0/0x1b3
 [<c207ecec>] do_ioctl+0x1c/0x5d
 [<c207ef77>] vfs_ioctl+0x24a/0x25c
 [<c207efd1>] sys_ioctl+0x48/0x5f
 [<c2004eff>] syscall_call+0x7/0xb

Even when the capture kernel happens to boot successfully, it may still fail to
save the vmcore to a network target.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Reserve intel-s6e5231-01.rhts.boston.redhat.com.
2. Export intel-s6e5231-01.rhts.boston.redhat.com:/mnt as an NFS share.
3. Run the automated test /kernel/distribution/kexec-tools/net.
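
For anyone reproducing the setup by hand, the two configuration lines implied by steps 2 and 3 can be sketched as below. This is an assumption-laden illustration, not what the automated test is known to do: the helper names are hypothetical, the export options (rw, no_root_squash, sync) are a guess, and the "net" line assumes the RHEL 5 /etc/kdump.conf directive for an NFS dump target.

```python
def exports_line(path, clients="*", options=("rw", "no_root_squash", "sync")):
    """Build an /etc/exports entry for the NFS server.

    The default client pattern and options are assumptions, not from the report.
    """
    return f"{path} {clients}({','.join(options)})"


def kdump_net_line(server, path):
    """Build the NFS dump-target line for /etc/kdump.conf on the crashing host.

    Assumes the RHEL 5 style "net <server>:<path>" directive.
    """
    return f"net {server}:{path}"


if __name__ == "__main__":
    server = "intel-s6e5231-01.rhts.boston.redhat.com"
    print(exports_line("/mnt"))           # goes in /etc/exports on the NFS server
    print(kdump_net_line(server, "/mnt"))  # goes in /etc/kdump.conf on the test box
```

After editing /etc/kdump.conf, the kdump service has to be restarted so that the capture initrd is rebuilt with the new target.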
Comment 1 CAI Qian 2008-04-16 03:11:23 EDT
Sometimes the capture kernel panics with:

NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
ACPI: (supports<6>Time: tsc clocksource has been installed.
 S0 S1 S4 S5)
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
RAMDISK: Compressed image found at block 0
crc error
VFS: Cannot open root device "VolGroup00/LogVol00" or unknown-block(0,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
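
Because the failure differs from run to run, a small script that tags each captured console log with the failure signatures it contains makes it easier to see how often each mode occurs. The following Python sketch is illustrative only: the signature names and the classify function are hypothetical, and the patterns are taken solely from the console excerpts quoted in this bug.

```python
import re

# (name, pattern) pairs; the names are labels invented for this sketch.
SIGNATURES = [
    ("oops", re.compile(r"^\s*Oops: [0-9a-f]{4} \[#\d+\]")),
    ("soft_lockup", re.compile(r"BUG: soft lockup")),
    ("ramdisk_crc_error", re.compile(r"^\s*crc error\s*$")),
    ("root_mount_panic",
     re.compile(r"Kernel panic - not syncing: VFS: Unable to mount root fs")),
]


def classify(console_text):
    """Return the signature names found, in order of first appearance."""
    found = []
    for line in console_text.splitlines():
        for name, pattern in SIGNATURES:
            if name not in found and pattern.search(line):
                found.append(name)
    return found


if __name__ == "__main__":
    sample = (
        "RAMDISK: Compressed image found at block 0\n"
        "crc error\n"
        "Kernel panic - not syncing: VFS: Unable to mount root fs on "
        "unknown-block(0,0)\n"
    )
    print(classify(sample))
```

A log that matches none of the patterns yields an empty list, which would correspond to a run that either succeeded or failed in a way not yet seen here.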

Comment 2 CAI Qian 2008-10-22 08:05:11 EDT
I'll close this, as it has been fixed magically in the latest RHEL 5.3 Beta candidate tree.
