Bug 1330548
Summary: | VMs failed to migrate when one of the nodes in the cluster is put into maintenance. | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | RamaKasturi <knarra> |
Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
Status: | CLOSED DUPLICATE | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.2 | CC: | bugs, dyuan, jsuchane, msivak, pzhang, rbalakri, sabose, xuzhang, ykaul |
Target Milestone: | pre-dev-freeze | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-09-12 13:49:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1258386 |
Description
RamaKasturi
2016-04-26 12:34:37 UTC
From the logs on the migration destination:

```
periodic/47::WARNING::2016-04-26 15:00:56,440::periodic::285::virt.vm::(__call__) vmId=`2050bdfa-caea-49a3-bad4-f49964b29657`::could not run on 2050bdfa-caea-49a3-bad4-f49964b29657: domain not connected
libvirtEventLoop::WARNING::2016-04-26 15:00:58,783::utils::140::root::(rmFile) File: /var/lib/libvirt/qemu/channels/2050bdfa-caea-49a3-bad4-f49964b29657.com.redhat.rhevm.vdsm already removed
jsonrpc.Executor/4::DEBUG::2016-04-26 15:00:58,773::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.destroy' in bridge with [u'2050bdfa-caea-49a3-bad4-f49964b29657']
jsonrpc.Executor/4::INFO::2016-04-26 15:00:58,788::API::341::vds::(destroy) vmContainerLock acquired by vm 2050bdfa-caea-49a3-bad4-f49964b29657
jsonrpc.Executor/4::DEBUG::2016-04-26 15:00:58,789::vm::3885::virt.vm::(destroy) vmId=`2050bdfa-caea-49a3-bad4-f49964b29657`::destroy Called
Thread-3732940::ERROR::2016-04-26 15:00:58,774::vm::753::virt.vm::(_startUnderlyingVm) vmId=`2050bdfa-caea-49a3-bad4-f49964b29657`::Failed to start a migration destination vm
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 722, in _startUnderlyingVm
    self._completeIncomingMigration()
  File "/usr/share/vdsm/virt/vm.py", line 2852, in _completeIncomingMigration
    self._incomingMigrationFinished.isSet(), usedTimeout)
  File "/usr/share/vdsm/virt/vm.py", line 2911, in _attachLibvirtDomainAfterMigration
    raise MigrationError(e.get_error_message())
MigrationError: Domain not found: no domain with matching uuid '2050bdfa-caea-49a3-bad4-f49964b29657'
Thread-3732940::INFO::2016-04-26 15:00:58,793::vm::1330::virt.vm::(setDownStatus) vmId=`2050bdfa-caea-49a3-bad4-f49964b29657`::Changed state to Down: VM failed to migrate (code=8)
Thread-3732940::DEBUG::2016-04-26 15:00:58,795::__init__::206::jsonrpc.Notification::(emit) Sending event {"params": {"2050bdfa-caea-49a3-bad4-f49964b29657": {"status": "Down", "timeOffset": "0", "exitReason": 8, "exitMessage": "VM failed to migrate", "exitCode": 1}, "notify_time": 5589338680}, "jsonrpc": "2.0", "method": "|virt|VM_status|2050bdfa-caea-49a3-bad4-f49964b29657"}
```

The sosreport tree output does contain the image path: /rhev/data-center/mnt/glusterSD/sulphur.lab.eng.blr.redhat.com:_vmstore/297a9b9c-4396-4b30-8bfe-976a67d49a74/images/2c6601e9-456f-4638-9ce4-d98efd97c053/86999bdc-a7bd-4d1e-9faa-a8ba7cf531f4

Martin, any ideas about this error? I'm afraid I don't have enough virt expertise to debug further. It seems the engine actually tried to migrate the VM, so this is not a scheduling issue.

And I do not know enough about the underlying libvirt logic; you need someone from the virt team for that (I do scheduling and QoS).

This looks like a libvirt or qemu bug; the versions look the same on both sides (see e.g. qemu/linux_vm.log). Moving to libvirt, feel free to push down the stack.

The "internal error: info migration reply was missing return status" is a result of bug 1374613. Because of that bug and the missing debug logs from libvirt, it is impossible to diagnose the real cause of the migration failure. I'm closing this bug as a duplicate of bug 1374613. If the issue can be reproduced with a package that has bug 1374613 fixed, please file a new bug so that we can properly investigate the root cause.

*** This bug has been marked as a duplicate of bug 1374613 ***