Bug 694750

Summary:	Qemu-kvm instance quitting very slow after ping-pong migration for long time
Product:	Red Hat Enterprise Linux 6	Reporter:	Mike Cao <bcao>
Component:	qemu-kvm	Assignee:	Rik van Riel <riel>
Status:	CLOSED WORKSFORME	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	6.1	CC:	bcao, gcosta, juzhang, michen, mkenneth, tburke, virt-maint
Target Milestone:	rc
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	766534		Environment:
Last Closed:	2011-12-12 08:11:15 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	766534

Description Mike Cao 2011-04-08 09:23:17 UTC

Description of problem:


Version-Release number of selected component (if applicable):
2.6.32-128.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.156.el6.x86_64


How reproducible:
sometimes

Steps to Reproduce:
1. Start the guest on the source host.
2. Start the listening port.
3. Do ping-pong live migration; once a migration completes, quit the qemu-kvm process on the source host.
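
A minimal sketch of the loop in step 3, assuming two hosts and a TCP migration port. The host names, the port number, and the helper name `ping_pong_plan` are hypothetical, and `<guest options>` stands for whatever command line the guest was started with. The helper only lists the commands to type and where; it does not run them. `-incoming tcp:0:PORT` and the monitor commands `migrate -d`, `info migrate`, and `quit` are standard qemu-kvm usage. The plan starts after step 1, with the guest already running on the first host.

```python
def ping_pong_plan(host_a, host_b, port, rounds):
    """Return (host, where, command) steps for `rounds` back-and-forth migrations.

    Sketch only: host names and port are placeholders chosen by the caller.
    """
    steps = []
    src, dst = host_a, host_b
    for _ in range(rounds):
        # Destination side: start qemu-kvm with the same guest options,
        # listening for an incoming migration.
        steps.append((dst, "shell", f"qemu-kvm <guest options> -incoming tcp:0:{port}"))
        # Source side: start the migration in the background, poll its status,
        # then quit the source process once it reports completion.
        steps.append((src, "monitor", f"migrate -d tcp:{dst}:{port}"))
        steps.append((src, "monitor", "info migrate"))
        steps.append((src, "monitor", "quit"))
        # The guest now runs on dst; swap roles for the next round.
        src, dst = dst, src
    return steps


if __name__ == "__main__":
    for host, where, command in ping_pong_plan("hostA", "hostB", 5800, 2):
        print(f"[{host}] ({where}) {command}")
```

Timing how long the `quit` step takes on the source host in each round would show whether the slowdown reported here appears.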
  
Actual results:
After migration, the qemu monitor is slow to respond.
After issuing (qemu) quit, the guest quits very slowly.

Expected results:
After migration, the qemu monitor still works as normal.
After issuing "quit" in the qemu monitor, the qemu-kvm process quits quickly.

Additional info:

Comment 2 RHEL Program Management 2011-04-09 06:00:19 UTC

Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Rik van Riel 2011-06-08 18:55:23 UTC

Here is the conversation that got me the bug.

Information wanted: how can I reproduce the bug?  What commands do I need to type and where? :)

<quintela> https://bugzilla.redhat.com/show_bug.cgi?id=694750
<supybot> Bug 694750: medium, medium, rc, quintela, NEW, Qemu-kvm instance quitting very slow after ping-pong migration for long time
<quintela> I want to give this to you/andrea
<riel> strange bug
<riel> do you suspect the kernel is involved at all?  or just qemu-kvm weirdness?
<riel> How reproducible:
<riel> sometimes
<riel> and probably no customer impact :)
<quintela> riel: it is always reproducible for me
<quintela> if you migrate, source of migration gets unresponsive
<quintela> it is kernel issue
<riel> ohhhh
<quintela> whole host
<riel> so it takes just a single migration?
<quintela> my understanding is that we do not "undo" the migration log, and then we do it in a very bad way
<riel> for the entire source host to become unresponsive?
<quintela> but I don't know how to do that
<quintela> riel: yeap
<riel> interesting
<quintela> riel: you need a big guest (better 8/16GB guest)
<quintela> i.e. we have something exponential somewhere
<riel> and two big hosts in the lab
<quintela> for me, it happens that I do a migration, and then exit qemu on the source
<quintela> and it takes 10-15 seconds
<quintela> for that time, no remote shell answers
<riel> woah
<riel> not even in other ssh sessions to the host?
<quintela> riel: yeap, not even that
<quintela> and QE was able to reproduce, so it is not my imagination O:-)
<riel> quintela: sure I'll take the bug
<riel> quintela: do we have hosts set up somewhere that reproduce it?  (can you teach me how to reproduce it?)
<riel> I wonder what qemu does to make migrate work - lots of mprotect calls?
<quintela> riel: kernel assistance
<quintela> we ask the kernel to mark what pages have changed
<quintela> and we reload a bitmap with the changed pages each time that we have sent everything dirty in the bitmap
<quintela> userspace does nothing more than reading that bitmap, and sending out the dirty ones
<riel> quintela: where is the code that sets up and maintains that bitmap?  (and yeah, you can assign the bug to me)
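
The dirty-page tracking described in the conversation above can be sketched roughly as follows. This is an illustration, not qemu-kvm's code: the function names, the callbacks `fetch_bitmap` and `send_page`, and the bit layout (one bit per guest page, least significant bit first within each byte) are assumptions. In KVM the bitmap itself comes from the kernel through the KVM_GET_DIRTY_LOG ioctl, which this sketch does not call.

```python
def dirty_pages(bitmap, num_pages):
    """Yield the index of every page whose bit is set in `bitmap`.

    Assumed layout: page i is bit (i % 8) of byte (i // 8).
    """
    for page in range(num_pages):
        if bitmap[page // 8] & (1 << (page % 8)):
            yield page


def migration_round(fetch_bitmap, send_page, num_pages):
    """Run one pass: reload the bitmap, send each dirty page, return the count.

    `fetch_bitmap` and `send_page` are hypothetical callbacks standing in for
    the kernel query and the network transfer.
    """
    bitmap = fetch_bitmap()
    sent = 0
    for page in dirty_pages(bitmap, num_pages):
        send_page(page)
        sent += 1
    return sent


if __name__ == "__main__":
    sent_pages = []
    count = migration_round(lambda: bytes([0b00000101, 0b10000000]),
                            sent_pages.append, 16)
    print(count, sent_pages)
```

Userspace repeats such rounds until few enough pages remain dirty to stop the guest and send the rest.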

Comment 5 Rik van Riel 2011-10-05 14:11:43 UTC

Juan, I have not reproduced the bug on the lab setup.  I have managed to make a 15GB guest migrate back and forth in a tight loop between two hosts, and it's always died after a while, without me observing the bug.

Comment 7 Dor Laor 2011-12-12 08:11:15 UTC

Closing. If QE can propose a script that helps reproduce it, the bug will be re-opened.