Bug 475827 - First live migration of RHEL4.6 para guest between RHEL5.3 panics guest
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Xen Maintainance List
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2008-12-10 17:53 UTC by Jan Tluka
Modified: 2008-12-10 18:05 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-12-10 18:05:18 UTC
Target Upstream Version:
Embargoed:



Description Jan Tluka 2008-12-10 17:53:06 UTC
Description of problem:
While reproducing bug 458934, I hit a panic in a paravirtualized guest.
The panic seems to occur only on the first migration (after a fresh install of the virtualization tools and kernel); almost all further migrations complete successfully.

The two dom0s use shared NFS storage (/var/lib/xen/images) for live migration. See 'Steps to Reproduce' for further details.

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-125.el5 (RHEL5.3-Server-20081204.0) dom0 
2.6.9-67.ELxenU (RHEL4.6) para guest

How reproducible:
100% on first migration in dom0. Later on it's less reproducible.

Steps to Reproduce:
1. install two RHEL5.3-Server-20081204.0 dom0
2. create a RHEL4.6 paravirtualized guest named rhel4-6; in the guest, create the following script as doit.sh and make it executable:
#!/bin/bash

while true; do
	touch /var/tmp/$$.log
	echo `hostname` >>  /var/tmp/$$.log
	echo `date`     >>  /var/tmp/$$.log
	cat  /var/tmp/$$.log
	df /var/tmp
	ls -l  /var/tmp/$$.log
	sleep 3
done

3. export /var/lib/xen/images/ directory via NFS on host1
4. mount exported share in host2
5. configure host1 and host2 for live migration (relocation entries in /etc/xen/xend-config.sxp)
6. xm create rhel4-6 on host1
7. xm console rhel4-6 on host1
8. start pinging the guest's IP from another host (I have not tried pinging from the dom0s)
9. launch doit.sh script in guest's console
10. start live migration from host1 to host2 by running on host1: xm migrate --live rhel4-6 host2

The guest panics with the following oops on its console:

WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dev:3027
invalid operand: 0000 [1] SMP 
CPU 0 
Modules linked in: md5 ipv6 autofs4 sunrpc loop xennet dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod xenblk sd_mod scsi_mod
Pid: 7, comm: xenwatch Not tainted 2.6.9-67.ELxenU
RIP: e030:[<ffffffff8023e4cf>] <ffffffff8023e4cf>{free_netdev+30}
RSP: e02b:ffffff8001ac7da0  EFLAGS: 00010293
RAX: 0000000000000002 RBX: ffffff8018278380 RCX: 000000000000113a
RDX: 000000000000113a RSI: 0000000000000000 RDI: ffffff8018278000
RBP: ffffff800088de00 R08: ffffff8000bc4a08 R09: ffffff8018278380
R10: 0000000100000000 R11: 0000000000000001 R12: ffffffff80352720
R13: 00000000fffffffc R14: ffffff8001861d78 R15: ffffffff80144c28
FS:  0000002a95576560(0000) GS:ffffffff80420000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process xenwatch (pid: 7, threadinfo ffffff8001ac6000, task ffffff8001a9e7f0)
Stack: ffffffffa00989bb ffffffffa009ccc8 ffffffff80223dcb ffffffffa009ccc8 
       ffffff800088de48 ffffffffa009ccc8 ffffffff8020e878 ffffffffff5fd000 
       ffffffff803527c0 ffffff800088de48 
Call Trace:<ffffffffa00989bb>{:xennet:netfront_remove+25} <ffffffff80223dcb>{xenbus_dev_remove+44} 
       <ffffffff8020e878>{device_release_driver+83} <ffffffff8020ea3c>{bus_remove_device+162} 
       <ffffffff8020dcb6>{device_del+104} <ffffffff8020dcdc>{device_unregister+9} 
       <ffffffff802244c5>{dev_changed+149} <ffffffff80144c28>{keventd_create_kthread+0} 
       <ffffffff802232a6>{xenwatch_handle_callback+21} <ffffffff8022343f>{xenwatch_thread+358} 
       <ffffffff8012dab8>{autoremove_wake_function+0} <ffffffff8012dab8>{autoremove_wake_function+0} 
       <ffffffff802232d9>{xenwatch_thread+0} <ffffffff80144bff>{kthread+200} 
       <ffffffff8010e05a>{child_rip+8} <ffffffff80144c28>{keventd_create_kthread+0} 
       <ffffffff80144b37>{kthread+0} <ffffffff8010e052>{child_rip+0} 
       

Code: 0f 0b fe 83 2a 80 ff ff ff ff d3 0b c7 87 10 02 00 00 05 00 
RIP <ffffffff8023e4cf>{free_netdev+30} RSP <ffffff8001ac7da0>
 <0>Kernel panic - not syncing: Oops

  
Actual results:
Guest panics.

Expected results:
Guest does not panic.

Additional info:
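Steps 3 to 5 and 10 above could be scripted roughly as follows. This is only a dry-run sketch: it prints the config lines and commands instead of running them. The export options (rw,sync,no_root_squash) and the relocation settings (port 8002, empty allow lists that accept any host) are assumptions suitable only for an isolated test network, not values taken from the setup that hit the panic. The function and variable names are made up for illustration.

```shell
#!/bin/bash
# Dry-run sketch of steps 3-5 and 10: nothing is changed on the system,
# the script only prints what would have to be configured or run.
# Host and guest names mirror the report; everything else is an assumption.

HOST1=${HOST1:-host1}                  # dom0 that exports the images
HOST2=${HOST2:-host2}                  # dom0 that receives the guest
GUEST=${GUEST:-rhel4-6}                # guest name from step 2
IMAGES=${IMAGES:-/var/lib/xen/images}  # shared image directory

# Step 3: line to add to /etc/exports on host1 (options are an assumption).
exports_line() {
    printf '%s %s(rw,sync,no_root_squash)\n' "$IMAGES" "$HOST2"
}

# Step 4: mount command to run on host2.
mount_cmd() {
    printf 'mount -t nfs %s:%s %s\n' "$HOST1" "$IMAGES" "$IMAGES"
}

# Step 5: relocation entries for xend-config.sxp on both hosts
# (normally under /etc/xen/). Empty allow lists accept any host.
relocation_entries() {
    cat <<'EOF'
(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-address '')
(xend-relocation-hosts-allow '')
EOF
}

# Step 10: migration command to run on host1.
migrate_cmd() {
    printf 'xm migrate --live %s %s\n' "$GUEST" "$HOST2"
}

echo "# host1: add to /etc/exports, then run 'exportfs -ra'"
exports_line
echo "# host2:"
mount_cmd
echo "# both hosts: xend-config.sxp, then restart xend"
relocation_entries
echo "# host1:"
migrate_cmd
```

After the relocation entries are edited, xend has to be restarted on both hosts (service xend restart) before the settings take effect. Other host names can be substituted by setting HOST1, HOST2, GUEST or IMAGES in the environment before running the script.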

Comment 1 Chris Lalancette 2008-12-10 18:05:18 UTC
I'm fairly certain that this was fixed by 4.7; see bug 435351.  Please test again with 4.7.  I'm going to close as CURRENTRELEASE for now; if 4.7 doesn't fix it, feel free to re-open.

Chris Lalancette

