Bug 229060

Summary: post-xen migration bug
Product: [Fedora] Fedora
Reporter: Brian Brock <bbrock>
Component: kernel-xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 6
CC: bstein, mjenner
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-02-26 23:55:38 UTC
Type: ---

Description Brian Brock 2007-02-16 19:06:53 UTC
Description of problem:
The guest kernel appears to BUG() at some point during a xen migration, or
immediately thereafter.



Version-Release number of selected component (if applicable):
kernel-xen 2.6.18-1.3002.el5xen
xen-3.0.3-25.el5
libvirt-0.1.8-15.el5
fc6 GOLD guest

old host has 2G ram
new host has 8G ram

reproducibility not yet known

setup:
create a guest running FC-6 GOLD paravirt
shut down the guest, move the image to an nfs server
mount the nfs server on both the old and new dom0
start guest on old dom0
run `xm migrate $domid $newhost`
after the domain has successfully transferred, run `xm console $domid` on the new dom0.
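
The setup steps above could be scripted roughly as follows. This is only a sketch: the guest name, config path, and target hostname are hypothetical placeholders (none of them come from this report), and it assumes `xm` accepts a domain name where the steps above use `$domid`. By default it only prints the commands; set DRY_RUN=0 on the old dom0 to actually run them.

```shell
#!/bin/sh
# Reproduction sketch for the setup steps in this report.
# All values below are hypothetical placeholders, not taken from the report.
GUEST="${GUEST:-fc6-pv}"                     # assumed guest name (or domain ID)
GUEST_CFG="${GUEST_CFG:-/etc/xen/fc6-pv}"    # assumed guest config file
NEW_HOST="${NEW_HOST:-newhost.example.com}"  # assumed target dom0
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands only

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reproduce() {
    # Precondition (not scripted here): the guest image lives on an NFS
    # export that is mounted on both the old and the new dom0.

    # Start the guest on the old dom0.
    run xm create "$GUEST_CFG"

    # Migrate it to the new dom0, as in the report.
    run xm migrate "$GUEST" "$NEW_HOST"

    # The console must be opened on the new dom0, so this is only a reminder.
    echo "# on $NEW_HOST: xm console $GUEST"
}

reproduce
```

With the defaults this prints the `xm create` and `xm migrate` commands plus a reminder to open the console on the new dom0, which is where the BUG output appeared.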

`xm console` immediately generated the following:

------------[ cut here ]------------
kernel BUG at drivers/xen/netfront/netfront.c:766!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /block/dm-1/range
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc xennet ipv6
dm_multipath parport_pc lp parport pcspkr dm_snapshot dm_zero dm_mirror dm_mod
xenblk ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0061:[<d4116b30>]    Not tainted VLI
EFLAGS: 00010082   (2.6.18-1.2798.fc6xen #1) 
EIP is at network_alloc_rx_buffers+0x1a0/0x3f9 [xennet]
eax: cd8a0000   ebx: cd8aad80   ecx: 0000d8ab   edx: d1970000
esi: d1970400   edi: 00000000   ebp: d19734b0   esp: c0c70f0c
ds: 007b   es: 007b   ss: 0069
Process xenwatch (pid: 9, ti=c0c70000 task=c0f8c5e0 task.ti=c0c70000)
Stack: d4118d09 d1970000 00001d61 00000100 00000208 cd8ab000 000000dd d1f9c838 
       c0c70f4c 00000023 000000dd d4116e4c 00000000 d2641c80 0000002e 0000002e 
       d1970400 d1970000 d1f9c150 0000013a d411708c cf1027e7 c0ce2200 00000023 
Call Trace:
 [<d411708c>] backend_changed+0x1b0/0x1fb [xennet]
 [<c054c3fa>] otherend_changed+0x74/0x79
 [<c054a94b>] xenwatch_handle_callback+0x12/0x44
 [<c054b414>] xenwatch_thread+0x108/0x11e
 [<c042df67>] kthread+0xc0/0xed
 [<c0402a69>] kernel_thread_helper+0x5/0xb
DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb

                    Leftover inexact backtrace:

                     =======================
Code: 14 66 0f b6 44 24 0c 0f b7 f8 66 89 44 24 0a 83 bc be fc 04 00 00 00 74 1a
eb 10 83 7c 24 18 00 0f 84 17 02 00 00 e9 65 01 00 00 <0f> 0b fe 02 cc 8b 11 d4
89 9c be fc 04 00 00 8d 86 04 0d 00 00      
EIP: [<d4116b30>] network_alloc_rx_buffers+0x1a0/0x3f9 [xennet] SS:ESP 0069:c0c70f0c
 <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
 [<c040571a>] dump_trace+0x69/0x1af
 [<c0405878>] show_trace_log_lvl+0x18/0x2c
 [<c0405e18>] show_trace+0xf/0x11
 [<c0405e47>] dump_stack+0x15/0x17
 [<c0430a96>] down_read+0x12/0x20
 [<c0428ba5>] blocking_notifier_call_chain+0xe/0x29
 [<c041ec7d>] do_exit+0x1b/0x776
 [<c0405db9>] die+0x289/0x2ae
 [<c04063c2>] do_invalid_op+0xa2/0xab
 [<c040502b>] error_code+0x2b/0x30
DWARF2 unwinder stuck at error_code+0x2b/0x30

Leftover inexact backtrace:

 [<c046007b>] shmem_getpage+0x250/0x584
 [<d4116b30>] network_alloc_rx_buffers+0x1a0/0x3f9 [xennet]
 [<d4116e4c>] xennet_set_tso+0x52/0x5f [xennet]
 [<d411708c>] backend_changed+0x1b0/0x1fb [xennet]
 [<c054c3fa>] otherend_changed+0x74/0x79
 [<c054a94b>] xenwatch_handle_callback+0x12/0x44
 [<c054b414>] xenwatch_thread+0x108/0x11e
 [<c054a939>] xenwatch_handle_callback+0x0/0x44
 [<c042e020>] autoremove_wake_function+0x0/0x35
 [<c054b30c>] xenwatch_thread+0x0/0x11e
 [<c042df67>] kthread+0xc0/0xed
 [<c042dea7>] kthread+0x0/0xed
 [<c0402a69>] kernel_thread_helper+0x5/0xb
 =======================

Comment 1 Brian Brock 2007-02-16 19:46:02 UTC
not immediately reproduced with rhel5-Server-20070208.0

Comment 2 Brian Brock 2007-02-16 22:06:44 UTC
Changing release to Fedora; I'm not seeing this with RHEL5 guests.

Comment 3 Red Hat Bugzilla 2007-07-25 01:38:17 UTC
change QA contact

Comment 4 Chris Lalancette 2008-02-26 23:55:38 UTC
This report targets FC6, which is now end-of-life.

Please re-test against Fedora 7 or later, and if the issue persists, open a new bug.

Thanks