Bug 694750 - Qemu-kvm instance quitting very slow after ping-pong migration for long time
Summary: Qemu-kvm instance quitting very slow after ping-pong migration for long time
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Rik van Riel
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 766534
Reported: 2011-04-08 09:23 UTC by Mike Cao
Modified: 2013-01-09 23:46 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 766534 (view as bug list)
Environment:
Last Closed: 2011-12-12 08:11:15 UTC
Target Upstream Version:



Description Mike Cao 2011-04-08 09:23:17 UTC
Description of problem:


Version-Release number of selected component (if applicable):
2.6.32-128.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.156.el6.x86_64


How reproducible:
sometimes

Steps to Reproduce:
1. Start the guest on the source host.
2. Start a listening port on the destination host.
3. Do ping-pong live migration; once a migration completes, quit the qemu-kvm process on the source host.
  
Actual results:
After migration, the qemu monitor is slow to respond.
After issuing (qemu) quit, the guest quits very slowly.

Expected results:
After migration, the qemu monitor still works as normal.
After issuing "quit" in the qemu monitor, the qemu-kvm process quits quickly.

Additional info:
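A minimal sketch of the steps above, written as a plan of monitor commands rather than a tool that drives real hosts. It assumes the standard `-incoming tcp:0:PORT` option on the listening side and the `migrate -d`, `info migrate` and `quit` monitor commands; the host names, the port, and the helper names are hypothetical and not taken from the original test setup.

```python
from itertools import cycle, islice

# Hypothetical host names and port; substitute the real lab values.
HOSTS = ("hostA", "hostB")
PORT = 4444


def incoming_option(port=PORT):
    """Extra qemu-kvm command-line option for the listening (destination) side."""
    return "-incoming tcp:0:%d" % port


def round_commands(dst_host, port=PORT):
    """Monitor commands typed on the current source for one migration leg."""
    return [
        "migrate -d tcp:%s:%d" % (dst_host, port),  # start migration in the background
        "info migrate",                             # repeat until migration has completed
        "quit",                                     # the step reported as slow in this bug
    ]


def ping_pong_plan(rounds, hosts=HOSTS, port=PORT):
    """Return (source, destination, commands) for each leg, alternating hosts."""
    plan = []
    order = cycle(hosts)
    src = next(order)
    for dst in islice(order, rounds):
        plan.append((src, dst, round_commands(dst, port)))
        src = dst
    return plan


if __name__ == "__main__":
    print("destination side starts with:", incoming_option())
    for src, dst, commands in ping_pong_plan(4):
        print("%s -> %s" % (src, dst))
        for command in commands:
            print("  (qemu) " + command)
```

After each leg the old source has quit, so a new listening qemu-kvm has to be started there before the next leg; the plan only lists what is typed in the monitor.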

Comment 2 RHEL Program Management 2011-04-09 06:00:19 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Rik van Riel 2011-06-08 18:55:23 UTC
Here is the conversation that got me the bug.

Information wanted: how can I reproduce the bug?  What commands do I need to type and where? :)

--- byount is now known as byount|kcs
<quintela> https://bugzilla.redhat.com/show_bug.cgi?id=694750
<supybot> Bug 694750: medium, medium, rc, quintela, NEW, Qemu-kvm instance quitting very slow after ping-pong migration for long time
<quintela> I want to give this to you/andrea
<riel> strange bug
--> ddd (~ddutile.redhat.com) has joined #kvm
<riel> do you suspect the kernel is involved at all?  or just qemu-kvm weirdness?
--- twoerner is now known as twoerner_gone
<riel> How reproducible:
<riel> sometimes
<riel> and probably no customer impact :)
<quintela> riel: it is always reproducible for me
<quintela> if you migrate, source of migration gets unresponsive
<quintela> it is kernel issue
<riel> ohhhh
<quintela> whole host
<riel> so it takes just a single migration?
<quintela> my understanding is that we have not "undo" the migration log, and then we do it at a very bad way
<riel> for the entire source host to become unresponsive?
<quintela> but I don't know how to do that
<quintela> riel: yeap
<riel> interesting
<quintela> riel: you need a big guest (better 8/16GB guest)
<quintela> i.e. we have something exponential somewhere
<riel> and two big hosts in the lab
<quintela> for me, it happens that I do a migration, and then exit qemu on the source
<quintela> and it takes 10-15 seconds
<quintela> for that time, no remote shell answers
<riel> woah
<riel> not even in other ssh sessions to the host?
<quintela> riel: yeap, not even that
<quintela> and QE was able to reproduce, so it is not my imagination O:-)
--> Sanjay_M|commute (~smehrotr.redhat.com) has joined #kvm
--- Sanjay_M|commute is now known as Sanjay_M
<-- ulio has quit (Quit: Leaving)
<riel> quintela: sure I'll take the bug
<riel> quintela: do we have hosts set up somewhere that reproduce it?  (can you teach me how to reproduce it?)
<riel> I wonder what qemu does to make migrate work - lots of mprotect calls?
<quintela> riel: kernel assistance
<quintela> we ask the kernel to mark what pages have changed
<quintela> and we reload a bitmap with the changed pages each time that we sent everything dirty on the bitmap
<quintela> userspace does nothing more than reading that bitmap, and sending out the dirty ones
--> simong (~simong.redhat.com) has joined #kvm
<riel> quintela: where is the code that sets up and maintains that bitmap?  (and yeah, you can assign the bug to me)
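
The dirty-page tracking quintela describes above (the kernel marks changed pages, userspace reloads the bitmap after each pass and sends only the dirty pages) can be illustrated with a small simulation. This is only a sketch of the general idea, not qemu-kvm or kernel code: the function name, the `writes_per_round` callable, and the stop threshold are all hypothetical.

```python
def precopy_rounds(num_pages, writes_per_round, max_rounds=30, threshold=8):
    """Simulate iterative page sending driven by a dirty-page bitmap.

    writes_per_round is a hypothetical callable: given the round number, it
    returns the page indices the guest wrote while that round was being sent.
    Returns the number of pages sent in each round.
    """
    dirty = [True] * num_pages          # first pass: every page must be sent
    sent_per_round = []
    for rnd in range(max_rounds):
        to_send = [i for i, is_dirty in enumerate(dirty) if is_dirty]
        sent_per_round.append(len(to_send))
        # Reload the bitmap: clear it, then record what was written meanwhile.
        dirty = [False] * num_pages
        for page in writes_per_round(rnd):
            dirty[page % num_pages] = True
        if sum(dirty) <= threshold:
            # Few enough dirty pages remain: send them in one final pass.
            sent_per_round.append(sum(dirty))
            break
    return sent_per_round


if __name__ == "__main__":
    # A guest whose write rate halves every round converges quickly.
    print(precopy_rounds(100, lambda rnd: range(40 >> rnd)))
```

In the simulation userspace only reads the bitmap and counts pages; whatever cost the real bug involves would sit in setting up or tearing down the tracking, which this sketch does not model.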

Comment 5 Rik van Riel 2011-10-05 14:11:43 UTC
Juan, I have not reproduced the bug on the lab setup.  I have managed to make a 15GB guest migrate back and forth in a tight loop between two hosts, and it's always died after a while, without me observing the bug.

Comment 7 Dor Laor 2011-12-12 08:11:15 UTC
Closing. If QE can propose a script that helps reproduce it, the bug will be re-opened.
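
One ingredient such a script might need is a way to quantify the symptom from comment 3, where the source host stops answering for 10-15 seconds. A rough sketch, assuming a heartbeat sampler running on the source host while the qemu-kvm process quits; the function names and the 0.1 second interval are hypothetical choices.

```python
import time


def largest_gap(timestamps):
    """Largest interval between consecutive samples, in seconds."""
    return max((b - a for a, b in zip(timestamps, timestamps[1:])), default=0.0)


def sample_heartbeat(duration, interval=0.1, clock=time.monotonic, sleep=time.sleep):
    """Record a timestamp roughly every `interval` seconds for `duration` seconds."""
    stamps = [clock()]
    end = stamps[0] + duration
    while stamps[-1] < end:
        sleep(interval)
        stamps.append(clock())
    return stamps


if __name__ == "__main__":
    # Run this on the source host while issuing "quit" in the monitor.
    stamps = sample_heartbeat(duration=2.0)
    print("largest gap between heartbeats: %.3f s" % largest_gap(stamps))
```

On a healthy host the largest gap stays close to the sampling interval; a gap of several seconds during the quit would match the reported stall.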

