Bug 435351 - [RHEL4.7]: PV kernel can OOPs during live migrate
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel-xen
Version: 4.7
Hardware: All Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Chris Lalancette
QA Contact: Martin Jenner
Reported: 2008-02-28 15:05 EST by Chris Lalancette
Modified: 2008-07-24 15:27 EDT

Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Last Closed: 2008-07-24 15:27:07 EDT

Attachments
Fix for the crash mentioned in this bug (3.65 KB, patch)
2008-02-28 15:34 EST, Chris Lalancette

Description Chris Lalancette 2008-02-28 15:05:07 EST
Description of problem:
When attempting to live migrate a RHEL-4.7 PV kernel, you can run into the
following OOPS:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dev:3027
invalid operand: 0000 [1] SMP 
CPU 0 
Modules linked in: md5 ipv6 autofs4 sunrpc loop xennet dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod xenblk sd_mod scsi_mod
Pid: 7, comm: xenwatch Not tainted 2.6.9-68.15.ELxenU
RIP: e030:[<ffffffff8023e843>] <ffffffff8023e843>{free_netdev+30}
RSP: e02b:ffffff8000b97da0  EFLAGS: 00010293
RAX: 0000000000000002 RBX: ffffff801e2d8380 RCX: 00000000000017af
RDX: 00000000000017af RSI: 0000000000000000 RDI: ffffff801e2d8000
RBP: ffffff8001aeae00 R08: ffffff801fe76a08 R09: ffffff801e2d8380
R10: 0000000100000000 R11: 0000000000000001 R12: ffffffff80353100
R13: 00000000fffffffc R14: ffffff8000021d78 R15: ffffffff80144c5c
FS:  0000002a95577880(0000) GS:ffffffff80420a80(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process xenwatch (pid: 7, threadinfo ffffff8000b96000, task ffffff8000b6a7f0)
Stack: ffffffffa00989bb ffffffffa009cd28 ffffffff802240e3 ffffffffa009cd28 
       ffffff8001aeae48 ffffffffa009cd28 ffffffff8020eb94 ffffffffff5fd000 
       ffffffff803531a0 ffffff8001aeae48 
Call Trace:<ffffffffa00989bb>{:xennet:netfront_remove+25} <ffffffff802240e3>{xenbus_dev_remove+44}
       <ffffffff8020eb94>{device_release_driver+83} <ffffffff8020ed58>{bus_remove_device+162}
       <ffffffff8020dfd2>{device_del+104} <ffffffff8020dff8>{device_unregister+9}
       <ffffffff802247dd>{dev_changed+149} <ffffffff80144c5c>{keventd_create_kthread+0}
       <ffffffff802235c2>{xenwatch_handle_callback+21} <ffffffff8022375b>{xenwatch_thread+358}
       <ffffffff8012daa0>{autoremove_wake_function+0} <ffffffff8012daa0>{autoremove_wake_function+0}
       <ffffffff802235f5>{xenwatch_thread+0} <ffffffff80144c33>{kthread+200}
       <ffffffff8010e056>{child_rip+8} <ffffffff80144c5c>{keventd_create_kthread+0}
       <ffffffff80144b6b>{kthread+0} <ffffffff8010e04e>{child_rip+0}

What's interesting is that it doesn't happen every time.  It seems to trigger
when the hypervisor has very little free memory left (i.e. free_memory in the
output of "xm info" is very low) at the moment you attempt the live migrate.

I believe this bug was already fixed a long time ago by Glauber in
xen-3.1-testing.hg changeset 13100; the fix is in RHEL-5, so we must simply
have missed it for RHEL-4.
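
For anyone trying to reproduce this, the low-memory precondition can be checked before starting the migrate. The following is only a rough sketch: it assumes "xm info" prints "key : value" lines with free_memory given in MB, and the helper names and the 256 MB threshold are hypothetical (the report only says "very low", not how low).

```python
def parse_xm_info(text):
    """Parse `xm info`-style 'key : value' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def free_memory_is_low(text, threshold_mb=256):
    """Return True if free_memory is below threshold_mb.

    threshold_mb is an arbitrary illustrative value, not taken from the bug.
    """
    info = parse_xm_info(text)
    return int(info["free_memory"]) < threshold_mb
```

The text passed in would typically be the captured output of "xm info" on the host in question, for example via subprocess.run(["xm", "info"], capture_output=True, text=True).stdout.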
Comment 1 Chris Lalancette 2008-02-28 15:34:08 EST
Created attachment 296256 [details]
Fix for the crash mentioned in this bug

This is the fix for the crash in this bug.  This is upstream xen-3.1-testing
c/s 13100, massaged to apply to RHEL-4.
Comment 4 Vivek Goyal 2008-03-20 10:10:01 EDT
Committed in 68.24.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 7 errata-xmlrpc 2008-07-24 15:27:07 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0665.html
