Bug 870745 - ovirt-engine-backend [Task Manager]: vm migration in case of failure is not logged as its own task in the task manager
Summary: ovirt-engine-backend [Task Manager]: vm migration in case of failure is not l...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64
OS: Linux
medium
high
Target Milestone: ---
: ---
Assignee: Nobody's working on this, feel free to take it
QA Contact:
URL:
Whiteboard: virt
Depends On: 818289 844344
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-10-28 12:31 UTC by Dafna Ron
Modified: 2014-01-01 08:08 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 818289
Environment:
Last Closed: 2013-05-12 07:14:49 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Dafna Ron 2012-10-28 12:31:07 UTC
as discussed in bug 818289, I think that vm migration during failure is importent enough to have its own task and not be logged as a step under setting host as non-operational task 

===============================================================================
+++ This bug was initially created as a clone of Bug #818289 +++

Created attachment 581656 [details]
log

Description of problem:

if migration is initiated by the system we are not logging the migration in the task manager

Version-Release number of selected component (if applicable):

si2.1

How reproducible:

100%

Steps to Reproduce:
1. run vm on spm in two hosts cluster
2. block connectivity to storage from spm using iptables
3.
  
Actual results:

the vm's will migrate but migration will not be logged in the task manager


Expected results:

vm migration should appear in the task manager

Additional info: log

--- Additional comment from yzaslavs on 2012-05-14 07:17:12 EDT ---

VdsEventListener.vdsNonOperational
does not create a Job context for running the internal command.
As a result, the created step of migrate does not have an associated job.
This context should be propagated via MigrateAllVms to the step creation.

--- Additional comment from yzaslavs on 2012-05-14 07:20:35 EDT ---

In addition, the exception logging at

private ExecutionContext createMigrateVmContext(VM vm)


   } catch (RuntimeException e) {
            log.error(e);
        }

Should be improved

(MaintananceVdsCommand.java)

--- Additional comment from emesika on 2012-07-11 08:37:02 EDT ---

http://gerrit.ovirt.org/#/c/6148/

--- Additional comment from emesika on 2012-07-15 03:12:39 EDT ---

fixed in commit : 4d19328

--- Additional comment from dron on 2012-07-30 08:12:13 EDT ---

MigrateAllVms throws exception (bug: 844344). 

although vm migrates we see no task created in the task manager log. 




will attach full log

2012-07-30 14:10:03,451 ERROR [org.ovirt.engine.core.bll.MaintananceVdsCommand] (pool-4-thread-45) [1ef354a5] Failed to create ExecutionContext for MigrateVmCommand: java.lang.NullPointerException
        at org.ovirt.engine.core.bll.MaintananceVdsCommand.createMigrateVmContext(MaintananceVdsCommand.java:117) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.MaintananceVdsCommand.MigrateAllVms(MaintananceVdsCommand.java:96) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.MaintananceVdsCommand.MigrateAllVms(MaintananceVdsCommand.java:82) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.SetNonOperationalVdsCommand$1.run(SetNonOperationalVdsCommand.java:44) [engine-bll.jar:]
        at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:64) [engine-utils.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_05-icedtea]
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_05-icedtea]
        at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_05-icedtea]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [rt.jar:1.7.0_05-icedtea]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [rt.jar:1.7.0_05-icedtea]
        at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_05-icedtea]



2012-07-30 15:10:09,317 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (pool-4-thread-45) [30642514] VdsBroker::migrate::Entered (vm_guid=fc52981c-8271-42dc-97ab-40aae2f6bd37, srcHost=orange-vdsd.qa.lab.tlv.re
dhat.com, dstHost=blond-vdsh.qa.lab.tlv.redhat.com:54321,  method=online
2012-07-30 15:10:09,317 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (pool-4-thread-45) [30642514] START, MigrateBrokerVDSCommand(vdsId = 8fdfe32c-d96e-11e1-a853-001a4a169741, vmId=fc52981c-8271-42dc-97ab-40
aae2f6bd37, srcHost=orange-vdsd.qa.lab.tlv.redhat.com, dstVdsId=85a0068a-d96e-11e1-af1e-001a4a169741, dstHost=blond-vdsh.qa.lab.tlv.redhat.com:54321, migrationMethod=ONLINE), log id: 3b8992c5
2012-07-30 15:10:09,341 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (pool-4-thread-45) [30642514] FINISH, MigrateBrokerVDSCommand, log id: 3b8992c5
2012-07-30 15:10:09,403 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (pool-4-thread-45) [30642514] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 53617cea
2012-07-30 15:10:11,073 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-59) [4cd67a82] hostFromVds::selectedVds - blond-vdsh, spmStatus Free, storage pool iSCSI
2012-07-30 15:10:11,104 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-59) [4cd67a82] SpmStatus on vds 8fdfe32c-d96e-11e1-a853-001a4a169741: Unknown_Pool
2012-07-30 15:10:11,119 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-59) [4cd67a82] starting spm on vds blond-vdsh, storage pool iSCSI, prevId 2, LVER 5
2012-07-30 15:10:11,120 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (QuartzScheduler_Worker-59) [4cd67a82] START, SpmStartVDSCommand(vdsId = 85a0068a-d96e-11e1-af1e-001a4a169741, storagePoolId = b421b191-d2ec-4c
b0-a8e5-751be6cbaa43, prevId=2, prevLVER=5, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 6ca8debe
2012-07-30 15:10:11,149 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (QuartzScheduler_Worker-59) [4cd67a82] spmStart polling started: taskId = d945be39-f653-44d4-b2ad-d7da566f259f
2012-07-30 15:10:11,729 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-76) [337526ca] vds::refreshVmList vm id fc52981c-8271-42dc-97ab-40aae2f6bd37 is migrating to vds blond-vdsh ignoring it in the 
refresh till migration is done
2012-07-30 15:10:11,730 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-76) [337526ca] vds::refreshVmList vm id ef15de06-95be-4556-90ac-1c292f649285 is migrating to vds blond-vdsh ignoring it in the 
refresh till migration is done
2012-07-30 15:10:13,800 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-44) [46fe0ef] vds::refreshVmList vm id fc52981c-8271-42dc-97ab-40aae2f6bd37 is migrating to vds blond-vdsh ignoring it in the r
efresh till migration is done
2012-07-30 15:10:13,801 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-44) [46fe0ef] vds::refreshVmList vm id ef15de06-95be-4556-90ac-1c292f649285 is migrating to vds blond-vdsh ignoring it in the r
efresh till migration is done
2012-07-30 15:10:15,871 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-70) [23bd2346] vds::refreshVmList vm id fc52981c-8271-42dc-97ab-40aae2f6bd37 is migrating to vds blond-vdsh ignoring it in the 
refresh till migration is done
2012-07-30 15:10:15,872 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-70) [23bd2346] vds::refreshVmList vm id ef15de06-95be-4556-90ac-1c292f649285 is migrating to vds blond-vdsh ignoring it in the 
refresh till migration is done

--- Additional comment from dron on 2012-07-30 08:12:41 EDT ---

Created attachment 601240 [details]
engine log

--- Additional comment from emesika on 2012-08-09 05:00:08 EDT ---

(In reply to comment #5)

Are you checking BZ 844344 scenario here?
Please check with the original scenario described in this bug

--- Additional comment from dron on 2012-08-12 03:22:37 EDT ---

(In reply to comment #7)
> (In reply to comment #5)
> 
> Are you checking BZ 844344 scenario here?
> Please check with the original scenario described in this bug

this is not the same bug. 
Haim added a second host which is connected to a different pool. 
this is a normal migration between hosts who are in the same pool - no connection.

--- Additional comment from emesika on 2012-08-27 06:13:38 EDT ---

http://gerrit.ovirt.org/#/c/7494/

--- Additional comment from emesika on 2012-08-28 04:08:19 EDT ---

fixed in commit: 7131e73

--- Additional comment from emesika on 2012-09-10 09:38:53 EDT ---

https://gerrit.eng.lab.tlv.redhat.com/#/c/1586/

--- Additional comment from yzaslavs on 2012-09-10 09:43:28 EDT ---

Downstream merge:

https://gerrit.eng.lab.tlv.redhat.com/gitweb?p=ovirt-engine.git;a=commit;h=8ccba4c2cce8f38aa5f5752939e6e1a5398a0e89

--- Additional comment from dron on 2012-09-19 08:28:14 EDT ---

Created attachment 614336 [details]
log and screen shot

tested in si18
not seeing the vm's migration in task manager

screen shot and log attached.

--- Additional comment from emesika on 2012-09-20 09:43:02 EDT ---

Created attachment 614920 [details]
Screenshot showing that reported scenario is fixed

Had checked reported bug scenario again and it is working correctly.
If you have found a diffrend scenario please add it to this bug.

--- Additional comment from dron on 2012-09-20 10:03:46 EDT ---

(In reply to comment #14)
> Created attachment 614920 [details]
> Screenshot showing that reported scenario is fixed
> 
> Had checked reported bug scenario again and it is working correctly.
> If you have found a diffrend scenario please add it to this bug.

I tested the reported scenario in si18.

--- Additional comment from emesika on 2012-09-20 10:49:54 EDT ---

(In reply to comment #15)
> (In reply to comment #14)
> > Created attachment 614920 [details]
> > Screenshot showing that reported scenario is fixed
> > 
> > Had checked reported bug scenario again and it is working correctly.
> > If you have found a diffrend scenario please add it to this bug.
> 
> I tested the reported scenario in si18.

Scenario was checked again on si18 verifing that vdsm/qemu/libvirt & engine are all as reported in si18
In addition I had created a new database from blank on si18 and a new NFS Master

I did not succeeded to reproduce, works OK for me.

I had consulted with Yair, since we have a dead-end, please recheck on si19

--- Additional comment from yzaslavs on 2012-10-04 02:19:12 EDT ---

In replay to comment #16 -
Can you please check this?

--- Additional comment from dron on 2012-10-14 04:40:51 EDT ---

tested again in si20 - will attach screen shots. 
the vm's migration is not shown in the task manager and are not shown in the job table: 

engine=# SELECT * from job;
                job_id                |         action_type         |                          description                          |  status  |               owner_id               | visible |         start_time         |          end_
time          |      last_update_time      | correlation_id 
--------------------------------------+-----------------------------+---------------------------------------------------------------+----------+--------------------------------------+---------+----------------------------+--------------
--------------+----------------------------+----------------
 0e7b46fb-ee28-4120-b1c1-080583e6fc0f | DeactivateStorageDomain     | Deactivating Storage Domain Dafna-03 in Data Center iSCSI     | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:41.414+02 | 2012-10-14 10
:28:05.715+02 | 2012-10-14 10:28:05.715+02 | 1884c24f
 7eadd2b2-5a23-449a-8f3e-ba68c34f28a0 | DeactivateStorageDomain     | Deactivating Storage Domain Dafna-02 in Data Center iSCSI     | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:40.022+02 | 2012-10-14 10
:28:10.1+02   | 2012-10-14 10:28:10.1+02   | 5ce925b5
 e505d352-6b3f-49fe-b2ee-424e92ad394f | RunVm                       | Launching VM Win2008                                          | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:08.817+02 | 2012-10-14 10
:28:29.411+02 | 2012-10-14 10:28:29.411+02 | 3add8280
 851b5227-41c9-425b-ae3b-392eed34612d | RunVm                       | Launching VM Windows7-09                                      | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:24.916+02 | 2012-10-14 10
:28:29.472+02 | 2012-10-14 10:28:29.472+02 | 7c6e4ba9
 42f048c6-af05-4f72-aef7-134ac69ab302 | DeactivateStorageDomain     | Deactivating Storage Domain iSCSI_EXPORT in Data Center iSCSI | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:26:18.978+02 | 2012-10-14 10
:26:32.12+02  | 2012-10-14 10:26:32.12+02  | 5006f002
 0f1b41ca-7fe0-4089-9817-85b14a8a2887 | SetNonOperationalVds        | Setting Host gold-vdsc to Non-Operational mode.               | FINISHED | 00000000-0000-0000-0000-000000000000 | t       | 2012-10-14 10:33:22.017+02 | 2012-10-14 10
:33:22.308+02 | 2012-10-14 10:33:23.272+02 | 7087cfab
 1ccd8dde-871d-4bf8-9b1a-cf5e5361486d | ActivateVds                 | Activating Host gold-vdsd                                     | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:28:32.649+02 | 2012-10-14 10
:28:33.26+02  | 2012-10-14 10:28:33.26+02  | 35a1c29a
 5c798658-626e-4cef-837e-2508840b265e | RunVm                       | Launching VM Windows7-06                                      | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:17.197+02 | 2012-10-14 10
:28:34.539+02 | 2012-10-14 10:28:34.539+02 | 3e2dca7b
 87f1da7b-66a4-4e0c-bc17-9f757acd9fb3 | RunVm                       | Launching VM Windows7-07                                      | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:19.502+02 | 2012-10-14 10
:28:34.718+02 | 2012-10-14 10:28:34.718+02 | 5b361331
 e994d502-bea4-4aa6-9c7c-b39809f190f1 | DeactivateStorageDomain     | Deactivating Storage Domain Dafna-01 in Data Center iSCSI     | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:46.329+02 | 2012-10-14 10
:28:44.668+02 | 2012-10-14 10:28:44.668+02 | 6db577c1
 11121969-1ee8-4f2f-836c-2436c0635a97 | DetachStorageDomainFromPool | Detaching Storage Domain iSCSI_EXPORT from Data Center iSCSI  | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:29:04.501+02 | 2012-10-14 10
:29:44.7+02   | 2012-10-14 10:29:44.7+02   | 7bdf33a1
 8f0fc529-57de-4669-b7ba-28704eddbef0 | DeactivateStorageDomain     | Deactivating Storage Domain Dafna_New2 in Data Center iSCSI   | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:48.586+02 | 2012-10-14 10
:28:34.463+02 | 2012-10-14 10:28:34.463+02 | 379cccf1
 2335ea92-ec04-4578-83a0-dd52389f4bf7 | RunVm                       | Launching VM Windows7-10                                      | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:27.655+02 | 2012-10-14 10
:28:34.582+02 | 2012-10-14 10:28:34.582+02 | 74e5927e
 4c876eca-d2b8-4177-86a5-853da268e3c5 | RunVm                       | Launching VM Win7                                             | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:13.859+02 | 2012-10-14 10
:28:34.669+02 | 2012-10-14 10:28:34.669+02 | 6bcb9c3e
 02321937-8dc2-450f-b111-dee0364d78b8 | RunVm                       | Launching VM Windows7-08                                      | FINISHED | fdfc627c-d875-11e0-90f0-83df133b58cc | t       | 2012-10-14 10:27:22.066+02 | 2012-10-14 10
:28:34.632+02 | 2012-10-14 10:28:34.632+02 | 1ca50fe

--- Additional comment from dron on 2012-10-14 04:43:41 EDT ---

Created attachment 626832 [details]
screen shots

--- Additional comment from bazulay on 2012-10-16 10:57:20 EDT ---

Currently we can't reproduce the above problem in development environment (si20).

Pending reproduce from QA.

--- Additional comment from dron on 2012-10-22 05:55:04 EDT ---

conclusion: the tasks appear under the Non-operational host task which is currently part of the design. 
I disagree with the location and think it should be fixed fro future versions.

--- Additional comment from yzaslavs on 2012-10-23 04:35:04 EDT ---

I think we're too late at 3.1 for such a design requirement.
I would suggest moving this to rhevm-future

--- Additional comment from bazulay on 2012-10-25 15:05:14 EDT ---

After discussing this issue again with Dafna we had agreed:

1 - the bug was fixed as designed.
2 - However from the usability side Dafna claims it is hard to observe and it 
    should be located under a different task.
3 - Hence moving this BZ back to ON_QA, to be verified according to the original 
    design (under Non-operational).
4 - Dafna will open a different RFE bug to discuss this issue in the future.

--- Additional comment from dron on 2012-10-28 08:27:02 EDT ---

verified on si22.1

Comment 1 Itamar Heim 2013-05-12 07:14:49 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.


Note You need to log in before you can comment on or make changes to this bug.