Bug 435351 - [RHEL4.7]: PV kernel can OOPs during live migrate
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel-xen
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Chris Lalancette
QA Contact: Martin Jenner
Reported: 2008-02-28 15:05 EST by Chris Lalancette
Modified: 2008-07-24 15:27 EDT
Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Last Closed: 2008-07-24 15:27:07 EDT

Attachments
Fix for the crash mentioned in this bug (3.65 KB, patch)
2008-02-28 15:34 EST, Chris Lalancette

Description Chris Lalancette 2008-02-28 15:05:07 EST
Description of problem:
When attempting to live migrate a RHEL-4.7 PV kernel, you can run into the
following OOPS:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dev:3027
invalid operand: 0000 [1] SMP 
CPU 0 
Modules linked in: md5 ipv6 autofs4 sunrpc loop xennet dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod xenblk sd_mod scsi_mod
Pid: 7, comm: xenwatch Not tainted 2.6.9-68.15.ELxenU
RIP: e030:[<ffffffff8023e843>] <ffffffff8023e843>{free_netdev+30}
RSP: e02b:ffffff8000b97da0  EFLAGS: 00010293
RAX: 0000000000000002 RBX: ffffff801e2d8380 RCX: 00000000000017af
RDX: 00000000000017af RSI: 0000000000000000 RDI: ffffff801e2d8000
RBP: ffffff8001aeae00 R08: ffffff801fe76a08 R09: ffffff801e2d8380
R10: 0000000100000000 R11: 0000000000000001 R12: ffffffff80353100
R13: 00000000fffffffc R14: ffffff8000021d78 R15: ffffffff80144c5c
FS:  0000002a95577880(0000) GS:ffffffff80420a80(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process xenwatch (pid: 7, threadinfo ffffff8000b96000, task ffffff8000b6a7f0)
Stack: ffffffffa00989bb ffffffffa009cd28 ffffffff802240e3 ffffffffa009cd28 
       ffffff8001aeae48 ffffffffa009cd28 ffffffff8020eb94 ffffffffff5fd000 
       ffffffff803531a0 ffffff8001aeae48 
Call Trace:<ffffffffa00989bb>{:xennet:netfront_remove+25} <ffffffff802240e3>{xen
       <ffffffff8020eb94>{device_release_driver+83} <ffffffff8020ed58>{bus_remov
       <ffffffff8020dfd2>{device_del+104} <ffffffff8020dff8>{device_unregister+9
       <ffffffff802247dd>{dev_changed+149} <ffffffff80144c5c>{keventd_create_kth
       <ffffffff802235c2>{xenwatch_handle_callback+21} <ffffffff8022375b>{xenwat
       <ffffffff8012daa0>{autoremove_wake_function+0} <ffffffff8012daa0>{autorem
       <ffffffff802235f5>{xenwatch_thread+0} <ffffffff80144c33>{kthread+200} 
       <ffffffff8010e056>{child_rip+8} <ffffffff80144c5c>{keventd_create_kthread
       <ffffffff80144b6b>{kthread+0} <ffffffff8010e04e>{child_rip+0} 
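
For anyone comparing this trace against another one, here is a minimal Python
sketch that pulls the complete frames out of a pasted call trace. The names
FRAME_RE and parse_frames are hypothetical helpers, not part of any tool. It
assumes frames look like <address>{symbol+offset} as above and reads the offset
as decimal, which is an assumption about this kernel's oops format. Frames that
were cut off by the 80-column paste are skipped rather than guessed at.

```python
import re

# Matches one complete frame such as "<ffffffff8020dfd2>{device_del+104}".
# Frames truncated by the paste (no closing brace) do not match.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{([\w:.]+)\+(\d+)\}")


def parse_frames(trace):
    """Return (address, symbol, offset) tuples for every complete frame."""
    return [
        (int(addr, 16), sym, int(off))
        for addr, sym, off in FRAME_RE.findall(trace)
    ]


# First three lines of the call trace above, copied as pasted.
sample = (
    "Call Trace:<ffffffffa00989bb>{:xennet:netfront_remove+25} <ffffffff802240e3>{xen\n"
    "       <ffffffff8020eb94>{device_release_driver+83} <ffffffff8020ed58>{bus_remov\n"
    "       <ffffffff8020dfd2>{device_del+104} <ffffffff8020dff8>{device_unregister+9\n"
)

for addr, sym, off in parse_frames(sample):
    print(f"{addr:#018x} {sym}+{off}")
```

Only three of the six frames in the sample are complete, so only those are
returned; the truncated ones would need the original console output.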

What's interesting is that it doesn't happen every time.  The trigger seems to
be very little free hypervisor memory (i.e. free_memory in the output of
"xm info" is very low) at the time you attempt the live migrate.
I believe this bug was already fixed in xen-3.1-testing.hg changeset 13100 by
Glauber a long time ago; the fix is in RHEL-5, but we must have missed it for
RHEL-4.
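
To check whether a host is in the low-memory state before trying to reproduce,
something like the following Python sketch could be used. It assumes "xm info"
prints "key : value" lines and that free_memory is a whole number of megabytes.
The helper names, the 64 MB threshold and the sample values are all made up for
illustration; they are not measurements from the affected host.

```python
def parse_xm_info(text):
    """Parse 'key : value' lines, as printed by `xm info`, into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            info[key.strip()] = value.strip()
    return info


def hypervisor_memory_is_low(text, threshold_mb=64):
    """True if free_memory is below threshold_mb (the threshold is a guess)."""
    return int(parse_xm_info(text)["free_memory"]) < threshold_mb


# Illustrative sample only, not real output from the affected host.
sample = """\
total_memory           : 4095
free_memory            : 11
"""

print(hypervisor_memory_is_low(sample))
```

On a real dom0 the text would come from running "xm info" itself, for example
via subprocess.run(["xm", "info"], capture_output=True, text=True).stdout.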
Comment 1 Chris Lalancette 2008-02-28 15:34:08 EST
Created attachment 296256
Fix for the crash mentioned in this bug

This is the fix for the crash in this bug.  This is upstream xen-3.1-testing
c/s 13100, massaged to apply to RHEL-4.
Comment 4 Vivek Goyal 2008-03-20 10:10:01 EDT
Committed in 68.24.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 7 errata-xmlrpc 2008-07-24 15:27:07 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

