Bug 1007427 - vm disappeared after reboot
Status: CLOSED CURRENTRELEASE
Product: oVirt
Classification: Community
Component: ovirt-engine-core
Version: 3.3
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.3.1
Assigned To: Roy Golan
infra
Duplicates: 1007674
Depends On:
Blocks: 918494
Reported: 2013-09-12 09:20 EDT by Hans-Joachim
Modified: 2013-11-25 06:49 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-25 06:49:25 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker        ID      Priority  Status  Summary  Last Updated
oVirt gerrit   17582   None      None    None     Never
oVirt gerrit   18959   None      None    None     Never

Description Hans-Joachim 2013-09-12 09:20:14 EDT
Description of problem:
After a reboot of both Engine and Host, all VMs disappeared (even from the database).

Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.3.0-2.el6 vdsm-4.12.1-2.el6
Storage Domain: iSCSI

How reproducible:
so far each time....

Steps to Reproduce:
1. Import VMs from EXPORT Domain
2. Powerdown Host
3. Powerdown Engine
4. Powerup Engine


Actual results:
All VM definitions are lost.

Expected results:
VM definitions should survive a reboot.

Additional info:
Disks are not affected

This happens even if the Host is up. In this case, there is a message:
Failed to import Vm VM1 to Data Center Default, Cluster Default
Comment 1 James Wilson 2013-09-13 06:21:02 EDT
I can confirm this. The exact same thing is happening to me. I'm trying to import oVirt 3.2 VMs from an export domain. The import is successful, to the point that I can start the VMs.

Restart ovirt-engine and the VM is gone.  Prior to the latest ovirt-engine update, only the VM container was removed and the disk remained.  Since updating to ovirt-engine-3.3.0-3.el6.noarch a few moments ago, the container still gets removed, but now the disk is gone too!

CentOS 6.4 x86_64 / oVirt 3.3

Steps to Reproduce:

1) Import VM from an export domain into the cluster. 
2) Wait for Import successful message
3) View the machine in the VM overview of the cluster
(Machine can even be started successfully)
4) Stop and start ovirt-engine
5) Machine(s) are missing
6) Cluster displays failure message

On restarting ovirt-engine, the following is logged:

2013-09-13 11:17:51,886 INFO  [org.ovirt.engine.core.bll.ImportVmCommand] (pool-6-thread-3) Lock freed to object EngineLock [exclusiveLocks= key: lb.test.lan value: VM_NAME
, sharedLocks= key: 18f3477b-0b65-4255-a56c-379f2c1b326a value: REMOTE_VM
]
2013-09-13 11:17:52,065 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-6-thread-3) Correlation ID: 32298b0a, Call Stack: null, Custom Event ID: -1, Message: Failed to import Vm lb.test.lan to Data Center local_datacenter, Cluster local_cluster
2013-09-13 1
Comment 2 James Wilson 2013-09-13 06:39:33 EDT
More information on a duplicate bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1007674
Comment 3 Yair Zaslavsky 2013-09-13 06:45:50 EDT
Already fixed.
I guess the fix did not make it into 3.3.0-2.el6.

Explanation -

We need to revisit transaction management in some of our commands.

What happened is that, due to a transaction issue, async tasks for the import VM command were created with a vdsm_task_id equal to the empty GUID. One part of the code suspends the transaction and then queries for the async task. Since the earlier insert had not been committed, the query returns 0 rows, and a new async task is inserted.

After the commands finished successfully, these leftovers remained in the DB.

The user restarted the engine. The Async Task Manager detected that there were tasks for the import VM command. Since the vdsm task ID is 0 (which is bad), it ran the end-with-failure treatment, which in turn erased the imported VMs from the DB (as it should for "proper failures").
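
The sequence above can be illustrated with a small toy model. This is only a sketch and not the engine code: `ToyDb`, `import_vm`, and `restart_engine` are hypothetical names, and the rule that a successful import clears exactly one tracked task row is an assumption made for the illustration. It shows how a lookup that cannot see an uncommitted row leads to a leftover task with the empty GUID, which the restart handling then treats as a failed import.

```python
import uuid

EMPTY_GUID = uuid.UUID(int=0)  # 00000000-0000-0000-0000-000000000000


class ToyDb:
    """Toy store: rows written in a transaction stay invisible to
    outside readers until commit() is called."""

    def __init__(self):
        self.committed = []  # rows visible to every reader
        self.pending = []    # rows written but not yet committed

    def insert(self, row, in_tx):
        (self.pending if in_tx else self.committed).append(row)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def find_tasks(self, command_id, include_uncommitted):
        rows = self.committed + (self.pending if include_uncommitted else [])
        return [r for r in rows if r["command_id"] == command_id]


def import_vm(db, vms, vm_name, lookup_sees_uncommitted):
    command_id = uuid.uuid4()
    vms.add(vm_name)
    row = {"command_id": command_id, "vm": vm_name, "vdsm_task_id": EMPTY_GUID}
    # 1. Placeholder task is written inside the command's transaction.
    db.insert(dict(row), in_tx=True)
    # 2. The lookup runs with the transaction suspended. In the buggy case
    #    it cannot see the uncommitted row, so a second task is inserted.
    if not db.find_tasks(command_id, include_uncommitted=lookup_sees_uncommitted):
        db.insert(dict(row), in_tx=False)
    # 3. The command finishes and the original transaction commits.
    db.commit()
    # 4. Assumption of this sketch: success clears one tracked task row.
    tracked = db.find_tasks(command_id, include_uncommitted=False)[0]
    db.committed.remove(tracked)


def restart_engine(db, vms):
    # Leftover tasks with the empty GUID get the end-with-failure
    # treatment, which removes the imported VM.
    for task in list(db.committed):
        if task["vdsm_task_id"] == EMPTY_GUID:
            vms.discard(task["vm"])
            db.committed.remove(task)


def demo(lookup_sees_uncommitted):
    db, vms = ToyDb(), set()
    import_vm(db, vms, "VM1", lookup_sees_uncommitted)
    restart_engine(db, vms)
    return vms
```

In this model, `demo(False)` reproduces the reported behaviour (the VM set ends up empty after the restart), while `demo(True)` stands in for a lookup that sees the first insert, so no leftover task exists and the VM survives.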
Comment 4 Yair Zaslavsky 2013-09-13 06:47:46 EDT
*** Bug 1007674 has been marked as a duplicate of this bug. ***
Comment 5 James Wilson 2013-09-13 07:01:38 EDT
Looking at http://gerrit.ovirt.org/#/c/17582/ - I imported a VM, manually modified the vdsmTaskIds entry, which was set to null.  Restarted ovirt-engine, and the VM remains.

This also works if, after the import and before restarting ovirt-engine, the task is deleted completely from the async_tasks table.

The VM remains.

Will this patch make it into Beta?
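
The manual cleanup described in this comment can be sketched against a stand-in database. This is a minimal illustration only: it uses an in-memory SQLite table rather than the engine's PostgreSQL database, and the column names (`task_id`, `action_type`, `vdsm_task_id`) and the helper names are assumptions for the example, not the real schema. Only the table name `async_tasks` and the empty-GUID condition come from the comments above.

```python
import sqlite3

EMPTY_GUID = "00000000-0000-0000-0000-000000000000"


def make_demo_db():
    """Build an in-memory stand-in for the async_tasks table.
    Column names are hypothetical, chosen for this sketch only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE async_tasks ("
        "task_id TEXT PRIMARY KEY, action_type TEXT, vdsm_task_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO async_tasks VALUES (?, ?, ?)",
        [
            # Leftover row of the kind described in Comment 3.
            ("t1", "ImportVm", EMPTY_GUID),
            # A task with a made-up, non-empty vdsm task id.
            ("t2", "ImportVm", "11111111-1111-1111-1111-111111111111"),
        ],
    )
    conn.commit()
    return conn


def delete_stale_tasks(conn):
    """Delete tasks whose vdsm task id is the empty GUID.
    Returns the number of rows removed."""
    cur = conn.execute(
        "DELETE FROM async_tasks WHERE vdsm_task_id = ?", (EMPTY_GUID,)
    )
    conn.commit()
    return cur.rowcount


def count_tasks(conn):
    return conn.execute("SELECT COUNT(*) FROM async_tasks").fetchone()[0]
```

Anyone trying something similar on a real installation would need to check the actual schema first and back up the database; the sketch only shows the shape of the cleanup.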
Comment 6 Itamar Heim 2013-09-15 03:42:01 EDT
GA is due. This should make GA.
Yair - was it cloned to the 3.3 branch?
Ofer - FYI
Comment 7 Yair Zaslavsky 2013-09-15 03:44:17 EDT
Itamar, 
This was cloned to the 3.3 branch -

http://gerrit.ovirt.org/#/c/18959/
Comment 8 Sandro Bonazzola 2013-11-25 06:49:25 EST
oVirt 3.3.1 has been released
