Bug 1004184

Summary: Stateless VM with memory state snapshot fails to run due to "VolumeError: Bad volume specification"
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-engine
Version: 3.3.0
Target Milestone: ---
Target Release: 3.3.0
Hardware: x86_64
OS: Linux
Whiteboard: virt
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Keywords: TestBlocker
Flags: amureini: Triaged+
Reporter: Gadi Ickowicz <gickowic>
Assignee: Arik <ahadas>
QA Contact: Gadi Ickowicz <gickowic>
CC: acathrow, amureini, bazulay, iheim, lpeer, michal.skrivanek, nlevinki, Rhev-m-bugs, scohen, yeylon
Fixed In Version: is24
Doc Type: Bug Fix
Type: Bug
Bug Blocks: 1038284
Last Closed: 2014-01-21 22:16:08 UTC
Attachments: vdsm + engine logs (flags: none)

Description Gadi Ickowicz 2013-09-04 07:37:05 UTC
Created attachment 793511 [details]
vdsm + engine logs

Description of problem:
When starting a stateless VM with memory state, the VM fails to start due to:
Thread-37491::ERROR::2013-09-04 09:29:32,227::API::181::vds::(create) Error restoring VM parameters
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 167, in create
    fname = self._cif.prepareVolumePath(paramFilespec)
  File "/usr/share/vdsm/clientIF.py", line 280, in prepareVolumePath
    raise vm.VolumeError(drive)
VolumeError: Bad volume specification {'device': 'disk', 'imageID': '0bf1176b-b817-4ceb-9f7f-0b507edcac68', 'domainID': '30508168-e57a-49f1-a4c2-863781b0e58a', 'volumeID': 'b742e199-4da5-4de9-bc83-5710268e7f20', 'poolID': '529e116b-ef0b-41f4-bbe2-e5b6207ae990'}
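
For context, the memory-state volume referenced in the restored VM parameters is prepared through the same path as a regular disk volume, and any failure to resolve it gets re-raised as a generic VolumeError. The following is a much-simplified sketch of that failure pattern (illustration only, not the actual vdsm code; `prepare_image` is a hypothetical stand-in for the storage-layer call):

# Simplified illustration only -- NOT the actual vdsm implementation.
# It shows why a missing memory-state volume surfaces as the generic
# "Bad volume specification" error rather than the underlying storage error.

class VolumeError(Exception):
    """Raised when a drive/volume spec cannot be turned into a device path."""


def prepare_volume_path(drive, prepare_image):
    """drive: dict with 'domainID', 'poolID', 'imageID', 'volumeID' keys.
    prepare_image: hypothetical callable standing in for the storage layer."""
    try:
        return prepare_image(drive['domainID'], drive['poolID'],
                             drive['imageID'], drive['volumeID'])
    except Exception:
        # The original cause (here, the LV was already removed) is masked
        # by the generic "Bad volume specification" message seen in the log.
        raise VolumeError(drive)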


However, the logs show the lvcreate command for this volume:
[root@aqua-vds4 tmp]# egrep 'lvcreate|VolumeE' ./vdsm.log.1  | grep b74
c2e4c5f9-d27f-4352-8a93-8d45314698d6::DEBUG::2013-09-04 09:21:24,602::lvm::314::Storage.Misc.excCmd::(cmd) '/usr/bin/sudo -n /sbin/lvm lvcreate --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \'a|/dev/mapper/1gickowic-lun113637722|/dev/mapper/1gickowic-lun213637722|/dev/mapper/1storage_sefi1213764757|\', \'r|.*|\' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --contiguous n --size 1024m --addtag OVIRT_VOL_INITIALIZING --name b742e199-4da5-4de9-bc83-5710268e7f20 30508168-e57a-49f1-a4c2-863781b0e58a' (cwd None)


but checking on the host after the error, the LV is not visible:
[root@aqua-vds4 ~]# lvs -a
  LV                                   VG                                   Attr      LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  1256486b-aa69-49a1-a9aa-0eb1f7abec34 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  1e41f986-76e0-482e-a1db-b8ff49b8a4c9 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   2.00g                                             
  325aad0e-e9d6-43bf-a2c5-9f88a11dbf94 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  44b4e1b8-ef92-49a1-bd92-95e0ecebf481 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  4c22262b-35e0-4d5a-acf3-9a35c7c1bff7 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  55345c24-377a-4d86-a7e0-b89cafcaec42 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  5d5ef436-fa4d-4fb3-bc51-319a694f7b18 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  663c1484-8e7c-4338-b850-5402f76f24e3 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   4.88g                                             
  751d6d9b-ed26-4f56-ad39-8117b7322176 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  8566890c-2d9b-4ffe-8f7a-e0e4780ff613 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  9eb7b7c0-3d97-45c7-abb2-f7351dfb0bb4 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  ba671459-ef17-4362-ab4e-0a05001bcee9 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.00g                                             
  cf47a6f5-d0bc-4525-8302-81cf9e0f059c 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   5.00g                                             
  e017e890-b58e-4825-84d8-3cca93fbf629 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   2.00g                                             
  e80253fd-5f43-46f3-bd92-61dcdb6b749a 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.38g                                             
  fdab101b-8628-4a02-834f-c26b2086c4e4 30508168-e57a-49f1-a4c2-863781b0e58a -wi------   1.38g                                             
  ids                                  30508168-e57a-49f1-a4c2-863781b0e58a -wi-ao--- 128.00m                                             
  inbox                                30508168-e57a-49f1-a4c2-863781b0e58a -wi-a---- 128.00m                                             
  leases                               30508168-e57a-49f1-a4c2-863781b0e58a -wi-a----   2.00g                                             
  master                               30508168-e57a-49f1-a4c2-863781b0e58a -wi-ao---   1.00g                                             
  metadata                             30508168-e57a-49f1-a4c2-863781b0e58a -wi-a---- 512.00m                                             
  outbox                               30508168-e57a-49f1-a4c2-863781b0e58a -wi-a---- 128.00m                                             
  ids                                  9d0988f7-ac85-4dfd-b933-8bb7cba67d2f -wi------ 128.00m                                             
  inbox                                9d0988f7-ac85-4dfd-b933-8bb7cba67d2f -wi------ 128.00m                                             
  leases                               9d0988f7-ac85-4dfd-b933-8bb7cba67d2f -wi------   2.00g                                             
  master                               9d0988f7-ac85-4dfd-b933-8bb7cba67d2f -wi------   1.00g                                             
  metadata                             9d0988f7-ac85-4dfd-b933-8bb7cba67d2f -wi------ 512.00m                                             
  outbox                               9d0988f7-ac85-4dfd-b933-8bb7cba67d2f -wi------ 128.00m                                             
  lv_root                              vg0                                  -wi-ao--- 457.71g                                             
  lv_swap                              vg0                                  -wi-ao---   7.85g   
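
The same check can be scripted instead of eyeballing the `lvs` output. A minimal sketch, assuming root access on the host and `lvs` in the PATH (the UUIDs are the ones from this report):

#!/usr/bin/env python
# Check whether a given vdsm volume (LV) exists in a storage-domain VG.
import subprocess

VG = '30508168-e57a-49f1-a4c2-863781b0e58a'   # storage domain UUID == VG name
LV = 'b742e199-4da5-4de9-bc83-5710268e7f20'   # volume UUID == LV name

out = subprocess.check_output(['lvs', '--noheadings', '-o', 'lv_name', VG])
names = [line.strip() for line in out.decode().splitlines()]
print('LV exists' if LV in names else 'LV is missing')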

Version-Release number of selected component (if applicable):
vdsm-4.12.0-92.gita04386d.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a snapshot with memory state of a VM that has an OS installed on it
2. Commit to the snapshot and set the VM to stateless
3. Start the VM (a scripted version of these steps is sketched below)
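
For reference, a scripted version of these steps against the RHEV 3.3 REST API might look roughly like the outline below. It is only a sketch: the engine URL, credentials, and VM/snapshot ids are placeholders, and the element/action names (persist_memorystate, preview_snapshot, commit_snapshot, stateless) should be verified against the API version in use.

# Rough reproduction sketch against the RHEV 3.3 REST API (outline only).
import requests

BASE = 'https://rhevm.example.com/api'          # placeholder engine URL
AUTH = ('admin@internal', 'password')           # placeholder credentials
VM = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'     # placeholder VM id
HDRS = {'Content-Type': 'application/xml'}

# 1. Create a snapshot that includes the memory state.
requests.post('%s/vms/%s/snapshots' % (BASE, VM), auth=AUTH, headers=HDRS,
              verify=False,
              data='<snapshot><description>mem-snap</description>'
                   '<persist_memorystate>true</persist_memorystate></snapshot>')

# 2. Preview + commit the snapshot (SNAP is the id of the snapshot created
#    above), then mark the VM stateless.
SNAP = 'yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy'   # placeholder snapshot id
requests.post('%s/vms/%s/preview_snapshot' % (BASE, VM), auth=AUTH,
              headers=HDRS, verify=False,
              data='<action><snapshot id="%s"/></action>' % SNAP)
requests.post('%s/vms/%s/commit_snapshot' % (BASE, VM), auth=AUTH,
              headers=HDRS, verify=False, data='<action/>')
requests.put('%s/vms/%s' % (BASE, VM), auth=AUTH, headers=HDRS, verify=False,
             data='<vm><stateless>true</stateless></vm>')

# 3. Start the VM.
requests.post('%s/vms/%s/start' % (BASE, VM), auth=AUTH, headers=HDRS,
              verify=False, data='<action/>')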

Actual results:
VM fails to start

Expected results:
VM should start and resume memory state of the snapshot

Additional info:

Comment 1 Gadi Ickowicz 2013-09-04 08:52:44 UTC
After the VM fails to launch in the scenario described above, removing the "stateless" setting from the VM and attempting to restore the snapshot fails as well, with a "Volume does not exist" error.

The snapshot that the stateless run was based on is now unusable:

Thread-40759::ERROR::2013-09-04 10:57:32,935::task::850::TaskManager.Task::(_setError) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3228, in prepareImage
    imgVolumes = img.prepare(sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/image.py", line 235, in prepare
    chain = self.getChain(sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/image.py", line 171, in getChain
    srcVol = volclass(self.repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 86, in __init__
    volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/volume.py", line 140, in __init__
    self.validate()
  File "/usr/share/vdsm/storage/blockVolume.py", line 94, in validate
    raise se.VolumeDoesNotExist(self.volUUID)  # Fix me
VolumeDoesNotExist: Volume does not exist: ('860403b6-f979-4328-b8cd-852841d9e963',)
Thread-40759::DEBUG::2013-09-04 10:57:32,937::task::869::TaskManager.Task::(_run) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::Task._run: 6f978fb8-8fef-4c38-9529-1e77c146add6 ('30508168-e57a-49f1-a4c2-863781b0e58a', '529e116b-ef0b-41f4-bbe2-e5b6207ae990', '93cfa566-d997-4278-ae12-4ae9c149ab98', '860403b6-f979-4328-b8cd-852841d9e963') {} failed - stopping task
Thread-40759::DEBUG::2013-09-04 10:57:32,937::task::1194::TaskManager.Task::(stop) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::stopping in state preparing (force False)
Thread-40759::DEBUG::2013-09-04 10:57:32,937::task::974::TaskManager.Task::(_decref) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::ref 1 aborting True
Thread-40759::INFO::2013-09-04 10:57:32,938::task::1151::TaskManager.Task::(prepare) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::aborting: Task is aborted: 'Volume does not exist' - code 201
Thread-40759::DEBUG::2013-09-04 10:57:32,938::task::1156::TaskManager.Task::(prepare) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::Prepare: aborted: Volume does not exist
Thread-40759::DEBUG::2013-09-04 10:57:32,938::task::974::TaskManager.Task::(_decref) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::ref 0 aborting True
Thread-40759::DEBUG::2013-09-04 10:57:32,938::task::909::TaskManager.Task::(_doAbort) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::Task._doAbort: force False
Thread-40759::DEBUG::2013-09-04 10:57:32,939::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-40759::DEBUG::2013-09-04 10:57:32,939::task::579::TaskManager.Task::(_updateState) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::moving from state preparing -> state aborting
Thread-40759::DEBUG::2013-09-04 10:57:32,939::task::534::TaskManager.Task::(__state_aborting) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::_aborting: recover policy none
Thread-40759::DEBUG::2013-09-04 10:57:32,939::task::579::TaskManager.Task::(_updateState) Task=`6f978fb8-8fef-4c38-9529-1e77c146add6`::moving from state aborting -> state failed
Thread-40759::DEBUG::2013-09-04 10:57:32,939::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.30508168-e57a-49f1-a4c2-863781b0e58a': < ResourceRef 'Storage.30508168-e57a-49f1-a4c2-863781b0e58a', isValid: 'True' obj: 'None'>}
Thread-40759::DEBUG::2013-09-04 10:57:32,940::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-40759::DEBUG::2013-09-04 10:57:32,940::resourceManager::615::ResourceManager::(releaseResource) Trying to release resource 'Storage.30508168-e57a-49f1-a4c2-863781b0e58a'
Thread-40759::DEBUG::2013-09-04 10:57:32,941::resourceManager::634::ResourceManager::(releaseResource) Released resource 'Storage.30508168-e57a-49f1-a4c2-863781b0e58a' (0 active users)
Thread-40759::DEBUG::2013-09-04 10:57:32,941::resourceManager::640::ResourceManager::(releaseResource) Resource 'Storage.30508168-e57a-49f1-a4c2-863781b0e58a' is free, finding out if anyone is waiting for it.
Thread-40759::DEBUG::2013-09-04 10:57:32,941::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.30508168-e57a-49f1-a4c2-863781b0e58a', Clearing records.
Thread-40759::ERROR::2013-09-04 10:57:32,941::dispatcher::67::Storage.Dispatcher.Protect::(run) {'status': {'message': "Volume does not exist: ('860403b6-f979-4328-b8cd-852841d9e963',)", 'code': 201}}

Comment 3 Ayal Baron 2013-09-22 07:28:20 UTC
(In reply to Gadi Ickowicz from comment #0)
> Created attachment 793511 [details]
> vdsm + engine logs
> 
> Description of problem:
> When starting a stateless vm with memory state the vm fails to start due to:
> Thread-37491::ERROR::2013-09-04 09:29:32,227::API::181::vds::(create) Error
> restoring VM parameters
> Traceback (most recent call last):
>   File "/usr/share/vdsm/API.py", line 167, in create
>     fname = self._cif.prepareVolumePath(paramFilespec)
>   File "/usr/share/vdsm/clientIF.py", line 280, in prepareVolumePath
>     raise vm.VolumeError(drive)
> VolumeError: Bad volume specification {'device': 'disk', 'imageID':
> '0bf1176b-b817-4ceb-9f7f-0b507edcac68', 'domainID':
> '30508168-e57a-49f1-a4c2-863781b0e58a', 'volumeID': 'b742e199-4da5
> -4de9-bc83-5710268e7f20', 'poolID': '529e116b-ef0b-41f4-bbe2-e5b6207ae990'}
> 
> 
> however the logs show the lvcreate command:
> [root@aqua-vds4 tmp]# egrep 'lvcreate|VolumeE' ./vdsm.log.1  | grep b74
> c2e4c5f9-d27f-4352-8a93-8d45314698d6::DEBUG::2013-09-04
> 09:21:24,602::lvm::314::Storage.Misc.excCmd::(cmd) '/usr/bin/sudo -n
> /sbin/lvm lvcreate --config " devices { preferred_names =
> [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
> disable_after_error_count=3 filter = [
> \'a|/dev/mapper/1gickowic-lun113637722|/dev/mapper/1gickowic-lun213637722|/
> dev/mapper/1storage_sefi1213764757|\', \'r|.*|\' ] }  global { 
> locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup { 
> retain_min = 50  retain_days = 0 } " --autobackup n --contiguous n --size
> 1024m --addtag OVIRT_VOL_INITIALIZING --name
> b742e199-4da5-4de9-bc83-5710268e7f20 30508168-e57a-49f1-a4c2-863781b0e58a'
> (cwd None)
> 
> 

The log also shows the deleteImage call that removed the image 8 seconds *before* the attempt to run the VM:

Thread-37484::INFO::2013-09-04 09:29:24,050::logUtils::44::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='30508168-e57a-49f1-a4c2-863781b0e58a', spUUID='529e116b-ef0b-41f4-bbe2-e5b6207ae990', imgUUID='0bf1176b-b817-4ceb-9f7f-0b507edcac68', postZero='false', force='false')

So the question is why the engine deleted the image before running the VM (I'm guessing it has to do with the VM being stateless).
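
For what it's worth, the 8-second gap can be confirmed directly from vdsm.log by comparing the timestamp of the deleteImage call with that of the "Error restoring VM parameters" line that precedes the VolumeError traceback. A small helper sketch (the log path is an assumption; the UUID is the image from this report):

#!/usr/bin/env python
# Print the time gap between the deleteImage call for a given image and the
# subsequent "Error restoring VM parameters" entry, based on vdsm.log timestamps.
from datetime import datetime

LOG = '/var/log/vdsm/vdsm.log'                 # adjust, e.g. vdsm.log.1 as above
IMG = '0bf1176b-b817-4ceb-9f7f-0b507edcac68'   # imageID from the VolumeError

def timestamp(line):
    # vdsm log lines look like: Thread-N::LEVEL::2013-09-04 09:29:24,050::...
    return datetime.strptime(line.split('::')[2], '%Y-%m-%d %H:%M:%S,%f')

delete_ts = error_ts = None
with open(LOG) as f:
    for line in f:
        if 'deleteImage' in line and IMG in line and delete_ts is None:
            delete_ts = timestamp(line)
        if 'Error restoring VM parameters' in line and error_ts is None:
            error_ts = timestamp(line)

if delete_ts and error_ts:
    print('deleteImage preceded the restore error by %s' % (error_ts - delete_ts))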

Comment 4 Tal Nisan 2013-10-27 11:30:40 UTC
It seems the system is misbehaving in the scenario of running a stateless VM on top of a memory snapshot. Michal, can one of your guys please take a look?

Comment 7 Gadi Ickowicz 2013-11-25 06:41:17 UTC
Verified on is24.1 - the stateless VM correctly resumed the memory state from the snapshot.

Comment 8 Itamar Heim 2014-01-21 22:16:08 UTC
Closing - RHEV 3.3 Released

Comment 9 Itamar Heim 2014-01-21 22:23:03 UTC
Closing - RHEV 3.3 Released