Description of problem:

Version-Release number of selected component (if applicable):
2.6.32-128.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.156.el6.x86_64

How reproducible:
sometimes

Steps to Reproduce:
1. Start the guest on the src host.
2. Start the listening port.
3. Do ping-pong live migration; once migration completes, quit the qemu-kvm process on the src host.

Actual results:
After migration, the qemu monitor is slow to respond. After issuing (qemu) quit, the guest quits very slowly.

Expected results:
After migration, the qemu monitor still works as normal. After issuing "quit" in the qemu monitor, the qemu-kvm process quits quickly.

Additional info:
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Here is the conversation that got me the bug. Information wanted: how can I reproduce the bug? What commands do I need to type and where? :)

--- byount is now known as byount|kcs
<quintela> https://bugzilla.redhat.com/show_bug.cgi?id=694750
<supybot> Bug 694750: medium, medium, rc, quintela, NEW, Qemu-kvm instance quitting very slow after ping-pong migration for long time
<quintela> I want to give this to you/andrea
<riel> strange bug
--> ddd (~ddutile.redhat.com) has joined #kvm
<riel> do you suspect the kernel is involved at all? or just qemu-kvm weirdness?
--- twoerner is now known as twoerner_gone
<riel> How reproducible:
<riel> sometimes
<riel> and probably no customer impact :)
<quintela> riel: it is always reproducible for me
<quintela> if you migrate, source of migration gets unresponsive
<quintela> it is kernel issue
<riel> ohhhh
<quintela> whole host
<riel> so it takes just a single migration?
<quintela> my understanding is that we have not "undo" the migration log, and then we do it at a very bad way
<riel> for the entire source host to become unresponsive?
<quintela> but I don't know how to do that
<quintela> riel: yeap
<riel> interesting
<quintela> riel: you need a big guest (better 8/16GB guest)
<quintela> i.e. we have something exponential somewhere
<riel> and two big hosts in the lab
<quintela> for me, it happens that I do a migration, and then exit qemu on the source
<quintela> and it takes 10-15 seconds
<quintela> for that time, no remote shell answers
<riel> woah
<riel> not even in other ssh sessions to the host?
<quintela> riel: yeap, not even that
<quintela> and QE was able to reproduce, so it is not my imagination O:-)
--> Sanjay_M|commute (~smehrotr.redhat.com) has joined #kvm
--- Sanjay_M|commute is now known as Sanjay_M
<-- ulio has quit (Quit: Leaving)
<riel> quintela: sure I'll take the bug
<riel> quintela: do we have hosts set up somewhere that reproduce it? (can you teach me how to reproduce it?)
<riel> I wonder what qemu does to make migrate work - lots of mprotect calls?
<quintela> riel: kernel assistance
<quintela> we ask the kernel to mark what pages have changed
<quintela> and we reload a bitmap with the changed pages each time that we sent everything dirty on the bitmap
<quintela> userspace does nothing more than reading that bitmap, and sending out the dirty ones
--> simong (~simong.redhat.com) has joined #kvm
<riel> quintela: where is the code that sets up and maintains that bitmap? (and yeah, you can assign the bug to me)
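
For readers following the log, here is a rough sketch of the userspace half that quintela describes: read a one-bit-per-page dirty bitmap and send out only the pages whose bit is set. It is a pure simulation and not qemu-kvm code. The names `PAGE_SIZE`, `dirty_pages`, `migrate_round` and the `send` callback are hypothetical, and the bit order within each byte is an assumption. In a real setup the bitmap would come from the kernel (KVM exposes it through the KVM_GET_DIRTY_LOG ioctl); here it is just a `bytes` value.

```python
PAGE_SIZE = 4096  # assumed guest page size in bytes (hypothetical constant)


def dirty_pages(bitmap: bytes):
    """Yield page numbers whose bit is set; bit 0 of byte 0 is page 0 (assumed layout)."""
    for byte_index, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (1 << bit):
                yield byte_index * 8 + bit


def migrate_round(memory: bytearray, bitmap: bytes, send) -> int:
    """Send every page marked dirty in the bitmap; return how many pages were sent."""
    sent = 0
    for page in dirty_pages(bitmap):
        start = page * PAGE_SIZE
        # 'send' stands in for writing the page to the migration stream.
        send(page, bytes(memory[start:start + PAGE_SIZE]))
        sent += 1
    return sent


if __name__ == "__main__":
    guest_memory = bytearray(16 * PAGE_SIZE)   # a tiny 16-page "guest"
    bitmap = bytes([0b00000101, 0b10000000])   # pages 0, 2 and 15 marked dirty
    count = migrate_round(guest_memory, bitmap,
                          lambda page, data: print("sending page", page))
    print(count, "pages sent this round")
```

A migration loop would repeat such rounds, fetching a fresh bitmap each time, until few enough pages remain dirty to stop the guest and send the rest.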
Juan, I have not reproduced the bug on the lab setup. I have managed to make a 15GB guest migrate back and forth in a tight loop between two hosts, and it has always died after a while, without my observing the bug.
Closing. If QE can propose a script that helps reproduce the bug, it will be re-opened.