Description of problem:
Eventually, migration of a VM gets stuck in the "Migrating From" or "Migrating To" state. In the first case, the migrating VM continues to run on the source host. In the second case, the VM actually migrates to the destination host (verified with vdsClient -s 0 list table). The VM stays in the migrating state until I restart ovirt-engine.

Version-Release number of selected component (if applicable):
Hosted engine setup, 2 hosts.
Hosts software:
OS Version: RHEL - 7 - 2.1511.el7.centos.2.10
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 327.28.3.el7.x86_64
KVM Version: 2.3.0 - 31.el7.16.1
LIBVIRT Version: libvirt-1.2.17-13.el7_2.5
VDSM Version: vdsm-4.18.11-1.el7.centos
SPICE Version: 0.12.4 - 15.el7_2.1
GlusterFS Version: glusterfs-3.7.15-1.el7
CEPH Version: librbd1-0.80.7-3.el7
Hosted engine: ovirt-engine-4.0.3-1.el7.centos.noarch
The initial setup used oVirt 3.5 and was then upgraded to 3.6, 4.0, and 4.0.3.

How reproducible:
Always

Steps to Reproduce:
1. Migrate a VM from one host to another and back several times.
2. Eventually the VM gets stuck in the "Migrating ..." state.
3. Restart the ovirt-engine service.
4. Ensure the VM state is Up.

Expected results:
The VM should migrate to the other host and its status should be updated accordingly.

Additional info:
Can you please attach the vdsm logs from the source and destination hosts, along with a description of a specific attempt so it can be identified? Thanks!
Hi Michal! Strangely, I cannot reproduce the issue today. I had observed it for the 2 days before. Today I migrated my VM from one host to another several times, both one by one and simultaneously, and the migrations were successful. Eventually I tried to switch one of my hosts (icgs-hv2), which was running all the VMs, to maintenance. The VMs migrated to the other host (icgs-hv1) successfully, but the host got stuck in the "Preparing for maintenance" state. The "Moving host to maintenance" task has not completed. Several new "Moving host to maintenance" tasks appeared in the task list (please see the attached screenshot), but I did not start those tasks. In the original task's details there are a few "Moving VM" subtasks still in progress (see the other screenshot). The original task was started at 11:01:17. I think my issue is related to ovirt-engine or the DB, not to vdsm.
Created attachment 1203632 [details] Screenshot of running tasks
Created attachment 1203633 [details] Screenshot of running task detail
Created attachment 1203634 [details] vdsm log from host icgs-hv1 (destination)
Created attachment 1203635 [details] vdsm log from host icgs-hv2 (source)
Created attachment 1203636 [details] engine log
Hi! The "Moving host to maintenance" task completed after I destroyed the VMs on host icgs-hv2 with vdsClient. Before being destroyed, these VMs were in the Down state.
Hi, I see 2 strange things in the logs:

1: You have an old guest agent. That should not have any influence on the migration, but it explains the message "Invalid or unknown guest architecture type 'i686' received from guest agent", which is caused by https://bugzilla.redhat.com/show_bug.cgi?id=1332723

2: All migrations end with:

2016-09-22 11:01:53,441 WARN [org.ovirt.engine.core.bll.ChangeVMClusterCommand] (DefaultQuartzScheduler7) [8a727a8] Validation of action 'ChangeVMCluster' failed for user SYSTEM. Reasons: VAR__ACTION__UPDATE,VAR__TYPE__VM__CLUSTER,VM_CLUSTER_IS_NOT_VALID

which seems more suspicious.

So, some questions:
- Are you migrating your VMs to a different cluster when this happens?
- By any chance, does this happen only when you migrate using the REST API, or also when you migrate normally using the webadmin?
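For anyone who wants to check their own engine.log for the same validation failure, a minimal Python sketch follows. It is only an illustration: the regular expression is modeled on the single log line quoted in this comment, so other engine versions or log layouts may not match, and the function names (parse_validation_failure, find_validation_failures) are hypothetical helpers, not part of oVirt.

```python
import re

# Pattern modeled on the one engine.log line quoted in this report.
# Assumption: other validation-failure lines follow the same layout.
VALIDATION_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<correlation_id>[^\]]*)\]\s+"
    r"Validation of action '(?P<action>[^']+)' failed for user (?P<user>\S+)\.\s+"
    r"Reasons:\s*(?P<reasons>\S+)"
)


def parse_validation_failure(line):
    """Return a dict describing a validation failure, or None if the line does not match."""
    match = VALIDATION_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # The reasons are a comma-separated list of message keys.
    record["reasons"] = record["reasons"].split(",")
    return record


def find_validation_failures(lines):
    """Collect all validation failures from an iterable of log lines."""
    parsed = (parse_validation_failure(line) for line in lines)
    return [record for record in parsed if record is not None]
```

To use it, open the engine log file (its location depends on the installation) and pass the file object to find_validation_failures; each returned record carries the timestamp, the action name, and the list of reason keys, which makes it easy to see whether every migration ends with VM_CLUSTER_IS_NOT_VALID.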
Hi! No, I have only one cluster in this setup. I tried migration only using webadmin. Unfortunately, I have rebuilt this setup from scratch and can no longer reproduce the issue.
Well, I guess that's good news. The only other thing I've seen in the logs is a minor issue with migration stats monitoring, which was fixed in 4.0.4, so it is not relevant any more either. Please feel free to reopen if you see it again.