Bug 851840 - ovirt-engine-backend: we are trying to migrate vm's although GetCapabilitiesVDS returns with Recovering from crash or Initializing on NFS storage type
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Omer Frenkel
QA Contact: Dafna Ron
URL:
Whiteboard: virt
Depends On:
Blocks:
Reported: 2012-08-26 11:59 UTC by Dafna Ron
Modified: 2012-12-04 20:07 UTC

Fixed In Version: si18
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-12-04 20:07:22 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments:
logs (1.23 MB, application/x-gzip)
2012-08-26 11:59 UTC, Dafna Ron

Description Dafna Ron 2012-08-26 11:59:37 UTC
Created attachment 607055 [details]
logs

Description of problem:

I blocked the storage domain from the host using iptables; the storage type is NFS.
The backend attempts to migrate the VMs even though GetCapabilitiesVDS returns Error: VDSRecoveringException: Recovering from crash or Initializing.
Since vdsm is not responding yet, the migration will fail, so there is no point in sending MigrateBrokerVDSCommand until vdsm responds.

Version-Release number of selected component (if applicable):

si15
vdsm-4.9.6-30.0.el6_3.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Create a setup with a two-host cluster on NFS storage, with VMs running and writing on the SPM host.
2. Block connectivity to the storage domain from the SPM.
3.
  
Actual results:

The backend sends GetCapabilitiesVDS to the host, which returns an error because vdsm is reinitializing, yet we still send MigrateBrokerVDSCommand for all VMs (which fails since vdsm is not responding).

Expected results:

As long as GetCapabilitiesVDS returns an error, there is no point in migrating.
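
The expected behaviour can be illustrated with a minimal sketch. This is not the engine's code and not the fix that was eventually submitted; it is a Python illustration in which `CapabilitiesResult`, `migrate_if_host_ready`, and the two callables are hypothetical names. It assumes a status code of 0 means success; code 99 is the value shown for "Recovering from crash or Initializing" in the log excerpt in this report.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: 0 denotes a successful reply from the host.
SUCCESS_CODE = 0
# Code reported in this bug for "Recovering from crash or Initializing".
RECOVERING_CODE = 99


@dataclass
class CapabilitiesResult:
    """Hypothetical stand-in for the reply to a capabilities query."""
    code: int
    message: str = ""


def migrate_if_host_ready(
    get_capabilities: Callable[[], CapabilitiesResult],
    migrate_vm: Callable[[str], None],
    vm_ids: List[str],
) -> List[str]:
    """Request migration only when the source host answers the query.

    Returns the ids of the VMs for which migration was requested.
    """
    result = get_capabilities()
    if result.code != SUCCESS_CODE:
        # The host is still recovering or initializing, so a migrate
        # request would fail with the same error. Skip it and let the
        # caller retry once the host responds.
        return []
    for vm_id in vm_ids:
        migrate_vm(vm_id)
    return list(vm_ids)
```

With a host that replies with code 99, the function returns an empty list and never calls `migrate_vm`; once the query succeeds, every VM id is passed to `migrate_vm` in order.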

Additional info: logs 

2012-08-26 14:40:04,913 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (QuartzScheduler_Worker-17) [56410677] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: 9c588ba2-ec35-11e1-a1e6-001a4a169741 Type: VDS
2012-08-26 14:40:04,913 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-17) [56410677] START, SetVdsStatusVDSCommand(vdsId = 9c588ba2-ec35-11e1-a1e6-001a4a169741, status=NonOperational, nonOperationalReason=TIMEOUT_RECOVERING_FROM_CRASH), log id: 33f42fc3
2012-08-26 14:40:04,916 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-17) [56410677] FINISH, SetVdsStatusVDSCommand, log id: 33f42fc3
2012-08-26 14:40:05,146 INFO  [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-4-thread-44) [3cb2804a] Running command: InternalMigrateVmCommand internal: true. Entities affected :  ID: 50737895-2cee-42aa-8aaf-734e7891a99b Type: VM
2012-08-26 14:40:05,156 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (pool-4-thread-44) [3cb2804a] START, MigrateVDSCommand(vdsId = 9c588ba2-ec35-11e1-a1e6-001a4a169741, vmId=50737895-2cee-42aa-8aaf-734e7891a99b, srcHost=gold-vdsd.qa.lab.tlv.redhat.com, dstVdsId=35b5ed18-ed2a-11e1-b9a6-001a4a169741, dstHost=nott-vds2.qa.lab.tlv.redhat.com:54321, migrationMethod=ONLINE), log id: 5c00ed23
2012-08-26 14:40:05,157 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (pool-4-thread-44) [3cb2804a] VdsBroker::migrate::Entered (vm_guid=50737895-2cee-42aa-8aaf-734e7891a99b, srcHost=gold-vdsd.qa.lab.tlv.redhat.com, dstHost=nott-vds2.qa.lab.tlv.redhat.com:54321,  method=online
2012-08-26 14:40:05,157 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (pool-4-thread-44) [3cb2804a] START, MigrateBrokerVDSCommand(vdsId = 9c588ba2-ec35-11e1-a1e6-001a4a169741, vmId=50737895-2cee-42aa-8aaf-734e7891a99b, srcHost=gold-vdsd.qa.lab.tlv.redhat.com, dstVdsId=35b5ed18-ed2a-11e1-b9a6-001a4a169741, dstHost=nott-vds2.qa.lab.tlv.redhat.com:54321, migrationMethod=ONLINE), log id: 6536193f
2012-08-26 14:40:05,178 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-44) [3cb2804a] Command org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand return value 
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus                       Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         99
mMessage                      Recovering from crash or Initializing
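
To spot this pattern in a larger engine log, a small script along the following lines may help. It is only a sketch: the regular expressions are inferred from the excerpt above, the function names are hypothetical, and it assumes each log entry sits on a single line, as in the original log file. It flags every `MigrateBrokerVDSCommand` START entry whose `vdsId` was previously set to `NonOperational`.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt: timestamp, level, [logger],
# (thread), [correlation id], message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]+)\]\s+"
    r"(?P<msg>.*)$"
)
# Matches the source host id ("vdsId = ..."), not "dstVdsId=...".
VDS_ID_RE = re.compile(r"vdsId = (?P<id>[0-9a-f-]+)")


def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return rec


def migrations_on_non_operational_hosts(lines):
    """Collect migrate requests sent to hosts already marked NonOperational."""
    non_operational = set()
    hits = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        msg = rec["msg"]
        m = VDS_ID_RE.search(msg)
        if m is None:
            continue
        host = m.group("id")
        if msg.startswith("START, SetVdsStatusVDSCommand") and "status=NonOperational" in msg:
            non_operational.add(host)
        elif msg.startswith("START, MigrateBrokerVDSCommand") and host in non_operational:
            hits.append(rec)
    return hits
```

Continuation lines such as the `mCode` and `mMessage` dump do not match the entry layout and are skipped.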

Comment 2 Michal Skrivanek 2012-08-30 10:50:30 UTC
is this really regression?

Comment 3 Haim 2012-08-30 10:54:41 UTC
(In reply to comment #2)
> is this really regression?

not sure. removing this flag till proven otherwise.

Comment 4 Omer Frenkel 2012-08-30 13:57:36 UTC
[removed regression as this behaviour hasn't changed]

Comment 5 Omer Frenkel 2012-08-30 15:07:47 UTC
http://gerrit.ovirt.org/#/c/7638/

Comment 8 Dafna Ron 2012-09-19 12:09:10 UTC
Verified on si18.
Migration started only after getCapabilities returned.
The backend log shows that we wait for the host to reinitialize:
2012-09-19 14:59:59,147 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-53) [5c548fd0] Waiting to Host gold-vdsc to finish initialization for 50 Sec.

