Red Hat Bugzilla – Bug 1007427
vm disappeared after reboot
Last modified: 2013-11-25 06:49:25 EST
Description of problem:
After a reboot of both the Engine and the Host, all VMs disappeared (even from the database).
Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.3.0-2.el6 vdsm-4.12.1-2.el6
Storage Domain: iSCSI
How reproducible:
So far, every time.
Steps to Reproduce:
1. Import VMs from EXPORT Domain
2. Powerdown Host
3. Powerdown Engine
4. Powerup Engine
Actual results:
All VM definitions are lost.
Expected results:
VM definitions should 'survive' a reboot.
Additional info:
Disks are not affected.
This happens even if the Host is up. In this case, there is a message:
Failed to import Vm VM1 to Data Center Default, Cluster Default
I can concur. This exact thing is happening to me. I'm trying to import oVirt 3.2 VMs from an export domain. The import is successful, to the point that I can fire up the VMs.
Restart ovirt-engine and the VM is gone. Prior to the latest ovirt-engine update, only the VM container was removed and the disk remained. Since updating to ovirt-engine-3.3.0-3.el6.noarch a few moments ago, the container still gets removed, but now the disk is gone too!
CentOS 6.4 x86_64 / oVirt 3.3
Steps to Reproduce:
1) Import VM from an export domain into the cluster.
2) Wait for Import successful message
3) View the machine in the VM overview of the cluster
(Machine can even be started successfully)
4) Stop and start ovirt-engine
5) Machine(s) are missing
6) Cluster displays failure message
On restarting ovirt-engine, the following is logged:
2013-09-13 11:17:51,886 INFO [org.ovirt.engine.core.bll.ImportVmCommand] (pool-6-thread-3) Lock freed to object EngineLock [exclusiveLocks= key: lb.test.lan value: VM_NAME
, sharedLocks= key: 18f3477b-0b65-4255-a56c-379f2c1b326a value: REMOTE_VM
2013-09-13 11:17:52,065 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-6-thread-3) Correlation ID: 32298b0a, Call Stack: null, Custom Event ID: -1, Message: Failed to import Vm lb.test.lan to Data Center local_datacenter, Cluster local_cluster
More information on a duplicate bug:
I guess did not make it to 3.3.0-2.el6
We need to revisit transaction management in some of our commands.
What happened is a transaction issue. For the import VM command, async tasks were created with a vdsm_task_id equal to the empty GUID. One part of the code suspends the transaction and then queries for the async task; because the task has not been committed yet, the query returns 0 results, and a new async task is inserted.
After the commands finished successfully, these leftovers remained in the DB.
The user then restarted the engine. The Async Task Manager detected that there were tasks for the import VM command. Since the vdsm task ID is 0 (which is bad), it ran the end-with-failure treatment, which in turn erased the imported VMs from the DB (as it should for "proper failures").
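The sequence described above can be illustrated with a toy model. The sketch below is not oVirt code: `ToyDb`, `create_task_if_missing`, `recover_on_restart`, and `demo` are hypothetical names, and the suspended transaction is simulated by a query that only sees committed rows.

```python
import uuid

EMPTY_GUID = uuid.UUID(int=0)  # stands in for the "empty guid" from the analysis


class ToyDb:
    """Toy stand-in for the engine database (hypothetical, not the oVirt schema)."""

    def __init__(self):
        self.committed_tasks = []  # rows visible to every query
        self.pending_tasks = []    # rows written but not yet committed
        self.vms = set()

    def query_tasks(self, command_id):
        # Runs outside the suspended transaction, so it sees committed rows only.
        return [t for t in self.committed_tasks if t["command_id"] == command_id]

    def commit(self):
        self.committed_tasks.extend(self.pending_tasks)
        self.pending_tasks = []


def create_task_if_missing(db, command_id):
    """Insert a placeholder task when the lookup finds none."""
    if not db.query_tasks(command_id):
        db.pending_tasks.append(
            {"command_id": command_id, "vdsm_task_id": EMPTY_GUID}
        )


def recover_on_restart(db, vm_of_command):
    """On restart, treat tasks with an empty vdsm task id as failed imports."""
    removed = []
    for task in list(db.committed_tasks):
        if task["vdsm_task_id"] == EMPTY_GUID:
            vm = vm_of_command[task["command_id"]]
            db.vms.discard(vm)  # end-with-failure: undo the import
            removed.append(vm)
            db.committed_tasks.remove(task)
    return removed


def demo():
    db = ToyDb()
    db.vms.add("VM1")
    create_task_if_missing(db, "import-1")  # placeholder, still uncommitted
    create_task_if_missing(db, "import-1")  # lookup misses it, inserts a duplicate
    db.commit()
    # The command finishes and removes the one task it tracks; one leftover stays.
    db.committed_tasks.pop()
    return recover_on_restart(db, {"import-1": "VM1"}), db.vms
```

In this model, `demo()` ends with the leftover task triggering the failure path, so the successfully imported "VM1" is removed on restart.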
*** Bug 1007674 has been marked as a duplicate of this bug. ***
Looking at http://gerrit.ovirt.org/#/c/17582/ - I imported a VM, manually modified the vdsmTaskIds entry, which was set to null. Restarted ovirt-engine, and the VM remains.
This also works if, after the import and before restarting ovirt-engine, the task is deleted completely from the async_tasks table.
The VM remains.
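As a rough illustration of the second workaround, the sketch below runs against an in-memory SQLite table rather than the real engine database. Only the table name `async_tasks` and the `vdsm_task_id` column come from this report; the `task_id` column, the all-zero GUID string, the sample rows, and the function names are assumptions made for the example.

```python
import sqlite3

# Assumed textual form of the empty GUID.
EMPTY_GUID = "00000000-0000-0000-0000-000000000000"


def build_toy_db():
    """Create an in-memory table with one leftover task and one normal task."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE async_tasks (task_id TEXT, vdsm_task_id TEXT)")
    conn.executemany(
        "INSERT INTO async_tasks VALUES (?, ?)",
        [
            ("t1", EMPTY_GUID),  # leftover placeholder task
            ("t2", "11111111-1111-1111-1111-111111111111"),  # made-up real task id
        ],
    )
    conn.commit()
    return conn


def delete_leftover_tasks(conn):
    """Delete tasks whose vdsm_task_id is empty; return the number of rows removed."""
    cur = conn.execute(
        "DELETE FROM async_tasks WHERE vdsm_task_id = ?", (EMPTY_GUID,)
    )
    conn.commit()
    return cur.rowcount
```

Calling `delete_leftover_tasks(build_toy_db())` removes only the placeholder row and leaves the task with a non-empty id in place.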
Will this patch make it into Beta?
GA is due. This should make GA.
Yair - was it cloned to the 3.3 branch?
Ofer - fyi
This was cloned to the 3.3 branch -
oVirt 3.3.1 has been released