Bug 886416
Summary: VMs are not migrated properly when NFS storage is blocked on host

Field | Value
---|---
Product | Red Hat Enterprise Virtualization Manager
Component | ovirt-engine
Status | CLOSED CURRENTRELEASE
Severity | high
Priority | unspecified
Version | 3.1.0
Target Milestone | ---
Target Release | 3.3.0
Hardware | All
OS | Linux
Whiteboard | virt
Fixed In Version | is25
Doc Type | Bug Fix
Type | Bug
Reporter | Pavel Stehlik <pstehlik>
Assignee | Roy Golan <rgolan>
QA Contact | Ilanit Stein <istein>
CC | acathrow, dallan, dron, eedri, iheim, istein, jkt, juzhang, lpeer, mavital, michal.skrivanek, mkletzan, ofrenkel, rgolan, Rhev-m-bugs, vfeenstr, yeylon
Bug Depends On | 972675, 1045833
Bug Blocks | 1038284
Description (Pavel Stehlik, 2012-12-12 09:10:39 UTC)
*** Bug 851837 has been marked as a duplicate of this bug. ***

(In reply to comment #0)
> Expected results:
> All VMs should migrate.

Per the KVM team in another BZ, a VM should not be migrated if it is already paused on EIO. So the limit of 5 concurrent migrations, plus the logic we added to block migration of paused VMs, may lead to the scenario above.

(In reply to comment #2)
PS: my problem is more with the VMs that migrated while PAUSED - continuing these may lead to image corruption, according to Dor.

(In reply to comment #3)
The engine doesn't migrate PAUSED VMs, and Pavel stated that he saw the VMs in MigratingFrom. Could it be that VMs that didn't write to the disk started migrating, tried to write after the migration, and then stopped?

(In reply to comment #4)
I suspect it could, and we may want to fail the migration in this case and leave the source VM in a paused state. If so, this can only be done at the VDSM level. Michal, Andy, your take on this?

Moving this to vdsm would probably help a bit, but doesn't solve the problem completely. We would need libvirt to fail this, because even at the vdsm level the source VM can be healthy when we issue the migrate, and then hit EIO during the migration. Right now libvirt will complete the migration nevertheless. Dave?

Simon, is it even worth trying to migrate away? It would always be a question of luck whether the VM accessed the storage before the migration finished. IMHO, not very predictable behavior.

(In reply to comment #6)
If the VM is already in EIO pause - no.

Now, you probably mean: is it worth trying to migrate away from a non-operational host whose state was caused by storage connectivity? The answer is not straightforward; there will be votes for any decision we may take (this is why we added the cluster policy of 'do not auto migrate'). However, since we may encounter a storage disconnect even for VMs that started migrating for any other reason, we need to handle the general case anyhow.

This is related to the qemu issue discussed in BZ#665820; there is no easy way to avoid it, as libvirt suffers from the same problem as the rhev-m engine. One idea is to check the status of the source VM once the last RAM page is transferred and the source CPU is paused - but before it is destroyed.
If we get this far without EIO/ENOSPC, it's safe to destroy the source and complete the migration; otherwise the destination needs to be abandoned.

Hi Michal,

I'm suggesting immediately pausing all domains affected by the loss of connectivity to the storage. That way you will keep the number of non-migrated domains at a minimum. In that case, the only domains ending in error would be those that started I/O after the storage was disconnected but before oVirt noticed (the pause was issued). I'm looking into modifying libvirt in a way that would help oVirt decide whether a migration is safe or not (without flooding libvirt with unnecessary API calls).

The time we get EIO from QEMU is a little non-deterministic from my POV, but I'm pretty sure that when the above problem appears (the guest performs I/O on detached storage before the pause is issued), it is easy to know that the VM was paused due to that fact and not just because of the request. This information is saved along with the domain status ('virsh domstate <domain> --reason' reports it, for example). Is that understandable?

(In reply to comment #10)
Sounds aggressive, because you're also pausing VMs that don't do I/O on that particular domain, so every storage disruption will pause the VMs unnecessarily.
(In reply to comment #11)
I don't feel it's aggressive at all. Of course you are pausing the machines that haven't touched NFS at all, but that's what you want. Let me explain it in a different way.

When host A is disconnected from the NFS share and starts migrating to host B, all the machines that were performing I/O on the NFS when oVirt realized it was disconnected are lost already (they will be paused due to EIO). This we can't fix. Then there are the other machines that are OK, and you start migrating them immediately but do not pause them. Since they are not paused, nobody can assure you they won't start any communication with the storage before they are completely migrated away (unless QEMU adds something like a "pause_only_storage" command), so these look OK, but we can lose them anyway. However, if you pause even those machines that don't perform any I/O on the disconnected storage, you can be sure no other machine will be lost due to the storage disconnection.

(In reply to comment #12)
Your point is clear, thanks. It is worth a policy-based decision in that case, because what if the I/O problem is local to the host itself? By migrating the VMs you save them from pausing. oVirt can set a host to non-operational if it detects that the host has I/O latency on its storage domains, so based on that we would like to complete the migration to the destination. I may be missing something though - Simon?

I was referring to the Resilience Policy at the cluster level.

(In reply to comment #13)
We need to think here about all possible scenarios. I tend to agree that if a storage domain is detected as unresponsive we should immediately PAUSE all the VMs that are using it - this should probably be done even at the VDSM level. Those machines, if not PAUSED on EIO, should be safe to migrate. This way you may end up with some VMs that are paused which might otherwise have continued, but with many other VMs practically saved, since they were properly paused before getting hit and may now be safely migrated to continue running on a different host - that is, if the failure is local.

This needs a minor code change and proper testing once bug 972675 gets in, hence proposing an exception.

See bug 961154 for the corresponding vdsm change. Here, on the engine side, we need at least a VdcOptions config for 3.3 clusters to send a new flag.

Michal, can you please advise how to verify this bug?
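The pause-then-migrate policy discussed above (pause every VM on the unresponsive domain, then migrate only VMs not paused on EIO) can be sketched as plain decision logic. This is a hypothetical illustration, not actual engine or vdsm code; the function name `plan_migrations` and the data shapes (`domains`, `pause_reason`) are invented for the example.

```python
# Hypothetical sketch of the policy discussed above; not actual vdsm/engine code.
# When a storage domain becomes unresponsive: pause every VM using it, and
# migrate only VMs that are unaffected or were paused by request (not by EIO).

def plan_migrations(vms, failed_domain):
    """Classify VMs after a storage domain failure.

    vms: list of dicts with 'name', 'domains' (set of storage domains the
    VM uses), and 'pause_reason' (None, 'user', or 'eio').
    Returns (to_pause, to_migrate, stuck) lists of VM names.
    """
    to_pause, to_migrate, stuck = [], [], []
    for vm in vms:
        if vm['pause_reason'] == 'eio':
            stuck.append(vm['name'])      # already hit EIO: unsafe to migrate
        elif failed_domain in vm['domains']:
            to_pause.append(vm['name'])   # pause first, then migrate safely
        else:
            to_migrate.append(vm['name']) # untouched by the failure
    return to_pause, to_migrate, stuck
```

Under this policy a VM already paused on EIO stays put (resuming it risks image corruption, per the discussion above), while a VM paused preemptively can be migrated and resumed on another host.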
I tried to block storage (iptables -A OUTPUT -d <storage server> -j REJECT) on the source host. The VM event "VM not responding" appeared, then the VM changed from up to "?", then it was migrated (automatically), and on the destination it was in status paused.

That's wrong... can you please look at that with Vinzenz? It should not appear as Paused on the destination. Please check the abort-on-EIO option (a 3.3 cluster-level engine-config option).

Abort-on-EIO option:

    # engine-config -g AbortMigrationOnError
    AbortMigrationOnError: true version: 3.3
    AbortMigrationOnError: false version: 3.2
    AbortMigrationOnError: false version: 3.1
    AbortMigrationOnError: false version: 3.0

Also, in the source host's vdsm.log this flag is seen: 'abortOnError': 'true'

Pete mentioned in his patch that libvirt supports this flag since version 1.0.1 upstream. I checked cyan-vdse.qa.lab and its libvirt version is 0.10.2.

(In reply to Roy Golan from comment #22)
The question is whether this build, which by the version looks like RHEL, contains this feature as a backport. Unfortunately you can't go by upstream versions for features when it comes to RHEL.

Thanks, Eyal. In order to verify this 3.3 bug, which adds the backend abortOnError flag, I first need to know that libvirt (libvirt-0.10.2-29.el6.1.x86_64) supports this flag - where can I find this info for downstream libvirt, please?

I see that this seems not to be working. When I drop all packets from the storage server on the source host side, the migration still continues, but the VM gets paused on the source. The VM migration does not get finalized and stays stuck at about 99%; it should eventually get aborted due to our no-progress timeout, but even that does not happen.
If I then allow the connection to the storage server again, we suddenly get a bunch of events, and surprisingly, even though we called abortJob, the migration succeeds, the VM ends up on the target, and it eventually happened to run again on the destination. To me it looks like libvirt got stuck because of the EIO; in any case, the abortOnError flag definitely does NOT work.

These are the events I was referring to:

    Thread-197::DEBUG::2013-12-17 16:24:10,112::fileSD::154::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/10.35.64.102:_volumes_wolf_data__0__nfs__2013082717821706028/7ac99a5a-1542-4757-b7fc-c43ac2caf8f4
    libvirtEventLoop::INFO::2013-12-17 16:24:10,112::vm::4366::vm.Vm::(_onAbnormalStop) vmId=`f7afb5e9-9f6a-4cca-8ef5-7d02e9a76969`::abnormal vm stop device virtio-disk0 error eio
    libvirtEventLoop::INFO::2013-12-17 16:24:10,120::vm::4366::vm.Vm::(_onAbnormalStop) vmId=`f7afb5e9-9f6a-4cca-8ef5-7d02e9a76969`::abnormal vm stop device virtio-disk0 error eio
    Thread-122::DEBUG::2013-12-17 16:24:10,122::fileSD::154::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/10.35.64.102:_volumes_wolf_export__istein__0__nfs__2013110512489493611/82d5b7d4-100b-4073-9496-cc3810c24168
    libvirtEventLoop::INFO::2013-12-17 16:24:10,123::vm::4366::vm.Vm::(_onAbnormalStop) vmId=`f7afb5e9-9f6a-4cca-8ef5-7d02e9a76969`::abnormal vm stop device virtio-disk0 error eio

(The same `_onAbnormalStop` eio event for this vmId repeats roughly thirty more times between 16:24:10,123 and 16:24:10,136; the repeats are omitted here.)

Moving this bug to verified, since the flag 'abortOnError': 'true' is seen in vdsm.log. Also adding a dependency on Bug 972675 (the related libvirt bug that is supposed to respond to this flag).

Closing - RHEV 3.3 Released
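As a footnote to the verification discussion: the thread relies on telling apart a VM paused due to EIO from one paused by request, using the reason libvirt stores with the domain state ('virsh domstate <domain> --reason'). A minimal, hypothetical helper classifying such output might look like the sketch below; the exact reason strings ("I/O error", "user") are an assumption based on virsh's usual "state (reason)" formatting, not taken from this bug.

```python
# Hypothetical helper, assuming virsh-style "state (reason)" output such as
# "paused (I/O error)" for a VM stopped on EIO, or "paused (user)" for a
# user-requested pause. Not part of vdsm; for illustration only.

def is_eio_pause(domstate_output):
    """Return True if the domain is paused because of an I/O error."""
    out = domstate_output.strip().lower()
    return out.startswith('paused') and 'i/o error' in out
```

Per the discussion above, only VMs for which this returns False would be considered safe to migrate or resume.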