Bug 518032
| Field | Value |
|---|---|
| Summary | Restoring a qemu guest from a saved state file using -incoming sometimes fails and hangs |
| Product | [Fedora] Fedora |
| Component | qemu |
| Version | 13 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | medium |
| Reporter | Daniel Berrangé <berrange> |
| Assignee | Juan Quintela <quintela> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | berrange, dwmw2, gcosta, itamar, jaswinder, markmc, maurizio.antillon, quintela, rbinkhor, virt-maint |
| Keywords | Reopened |
| Fixed In Version | qemu-0.12.3-5.fc13 |
| Doc Type | Bug Fix |
| | 570174 |
| Last Closed | 2011-06-27 14:20:52 UTC |
| Bug Blocks | 498969, 570174 |

Description
Daniel Berrangé
2009-08-18 14:40:45 UTC
This may or may not be TCG specific. I've not got spare hardware available to reproduce with F12 KVM.

Juan's patches look like they might help the "we should exit on error" problem: http://lists.gnu.org/archive/html/qemu-devel/2009-08/msg00876.html

Taking a look at this.

AFAICT, Juan's patches are for a different bit of code - savevm. The libvirt restore process uses incoming migrate. I've since found the error message 'load of migration failed' in the guest logs, which points to this code in migration-exec.c:

    static void exec_accept_incoming_migration(void *opaque)
    {
        QEMUFile *f = opaque;
        int ret;

        ret = qemu_loadvm_state(f);
        if (ret < 0) {
            fprintf(stderr, "load of migration failed\n");
            goto err;
        }
        qemu_announce_self();
        dprintf("successfully loaded vm state\n");
        /* we've successfully migrated, close the fd */
        qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, NULL, NULL, NULL);
        if (autostart)
            vm_start();

    err:
        qemu_fclose(f);
    }

This method just prints an error message upon load failure, and then carries on letting QEMU run as if all was well. In fact QEMU is pretty much trashed at this point, spinning in a 100% CPU loop, not responding to any monitor commands at all. It needs to abort/exit if it can't load an incoming migration.

The TCP migration code suffers from the same problem of just printing an error and pretending all is well.

I really need this fixed in F11/12/rawhide asap because it causes the libvirt test suite to hang when we try to test our handling of broken migration/restore data.

FYI, you can reproduce this problem very easily:

1. virsh save GUEST guest.saved
2. truncate --size 50MB guest.saved
3. virsh restore guest.saved

i.e., just destroy the end of the guest saved state file, without touching the initial libvirt XML header.

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
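The behaviour requested in the description (exit instead of carrying on after a failed load) can be illustrated with a small standalone sketch. This is not the QEMU code and not the patch attached to this bug: `load_state_fn`, `accept_incoming`, `accept_incoming_or_exit`, `vm_running` and the two simulated loaders are hypothetical names invented for the illustration, with `load_state_fn` standing in for `qemu_loadvm_state()` under the assumption that it returns 0 on success and a negative value on error.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for qemu_loadvm_state():
 * assumed to return 0 on success and a negative value on error. */
typedef int (*load_state_fn)(void *stream);

/* Simulated guest state: set to 1 once the guest has been started. */
static int vm_running = 0;

/* Try to load the incoming state. The guest is started only when the
 * load succeeded; on failure the error code is passed to the caller. */
int accept_incoming(load_state_fn load, void *stream, int autostart)
{
    assert(load != NULL);

    int ret = load(stream);
    if (ret < 0) {
        fprintf(stderr, "load of migration failed\n");
        return ret;
    }
    if (autostart)
        vm_running = 1;
    return 0;
}

/* Wrapper that terminates the process rather than continuing to run
 * with a partially loaded guest. */
void accept_incoming_or_exit(load_state_fn load, void *stream, int autostart)
{
    if (accept_incoming(load, stream, autostart) < 0)
        exit(EXIT_FAILURE);
}

/* Simulated loaders: one succeeds, one behaves like a truncated file. */
int load_ok(void *stream)        { (void)stream; return 0; }
int load_truncated(void *stream) { (void)stream; return -1; }
```

Calling `accept_incoming_or_exit(load_truncated, NULL, 1)` ends the process with a non-zero status, which is the property the reporter asks for; splitting out `accept_incoming` keeps the failure path checkable without terminating the caller.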
Created attachment 398965 [details]
Clear fd on migration errors
Attached patch fixes the problem. Already sent upstream for review.

qemu-0.12.3-1.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/qemu-0.12.3-1.fc12

qemu-0.12.3-5.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/qemu-0.12.3-5.fc13

qemu-0.12.3-5.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.

qemu-0.12.3-2.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/qemu-0.12.3-2.fc12

The provided fix does not ensure that QEMU exits. It merely stops it using 100% CPU. QEMU needs to exit since there is no other way to detect that the incoming migration has failed.

This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle. Changing version to '13'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

qemu-0.12.3-2.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update qemu'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/qemu-0.12.3-2.fc12

qemu-0.12.3-4.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/qemu-0.12.3-4.fc12

qemu-0.12.3-4.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update qemu'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/qemu-0.12.3-4.fc12

This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
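One comment in this report argues that QEMU must exit on a failed incoming migration because there is no other way for the caller to detect the failure. As a rough illustration of why an exit status is enough, the sketch below shows a parent process observing a child's failure through POSIX `fork()` and `waitpid()`. It is an assumption-laden example only: `run_and_wait`, `restore_fails` and `restore_succeeds` are hypothetical names, and nothing here claims to show how libvirt actually supervises QEMU.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run child_main in a forked child process and return its exit status.
 * Returns -1 if the fork or wait failed, or if the child did not exit
 * normally (for example, it was killed by a signal). */
int run_and_wait(int (*child_main)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_main());    /* child: report the result via exit status */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical children: one simulates a failed restore, one a good one. */
int restore_fails(void)    { return 1; }
int restore_succeeds(void) { return 0; }
```

A child that prints an error and keeps spinning never reaches `_exit()`, so the parent's `waitpid()` would block indefinitely, which matches the test-suite hang described in this report.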