Bug 807768 - ovirt-engine-backend: orphan object remains in SD if we remove vm that has a disk on the SD while its in maintenance
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64 Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.1.0
Assigned To: Tal Nisan
QA Contact: Dafna Ron
Whiteboard: storage
Keywords:
Depends On:
Blocks:
 
Reported: 2012-03-28 12:36 EDT by Dafna Ron
Modified: 2016-02-10 12:11 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
logs (1.26 MB, application/x-gzip)
2012-03-28 12:38 EDT, Dafna Ron

Description Dafna Ron 2012-03-28 12:36:32 EDT
Description of problem:

I removed a VM that has 2 disks on 2 different storage domains while one of the domains was in maintenance.
No error was reported by vdsm, so the backend removed the VM and image from the DB, while in actuality the image remains on the SD.
As a result, we have an orphaned object in the domain.

This only happens when the VM is of desktop type. vdsm fails the task when the VM is a server.

Version-Release number of selected component (if applicable):

vdsm-4.9.6-4.5.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Create two domains and attach them to a DC.
2. Add a VM (desktop, not server) with two disks, one on each domain.
3. Put one domain in maintenance and remove the VM.
  
Actual results:

vdsm does not fail the deletion of the image, and the backend completes the delete in the DB, creating an orphan object in the domain.

Expected results:

We should fail the task so the backend can roll back.

Additional info: logs will be attached.

image ID: 6c1f8d8a-8f88-4372-9546-b7b65921279e
Comment 1 Dafna Ron 2012-03-28 12:38:31 EDT
Created attachment 573395
logs
Comment 2 Dafna Ron 2012-03-28 12:39:24 EDT
[root@blond-vdsh ~]# lvs
  LV                                   VG                                   Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert
  ids                                  83a46a9e-dac2-4513-bb21-a33ff76a495a -wi----- 128.00m                                           
  inbox                                83a46a9e-dac2-4513-bb21-a33ff76a495a -wi----- 128.00m                                           
  leases                               83a46a9e-dac2-4513-bb21-a33ff76a495a -wi-----   2.00g                                           
  master                               83a46a9e-dac2-4513-bb21-a33ff76a495a -wi-ao--   1.00g                                           
  metadata                             83a46a9e-dac2-4513-bb21-a33ff76a495a -wi----- 512.00m                                           
  outbox                               83a46a9e-dac2-4513-bb21-a33ff76a495a -wi----- 128.00m                                           
  29af568c-e767-4217-9610-79e6c0e70602 91fd7b39-198c-4cb8-889e-837103b3c46c -wi-----   1.00g                                           
  6c1f8d8a-8f88-4372-9546-b7b65921279e 91fd7b39-198c-4cb8-889e-837103b3c46c -wi-----   2.00g                                           
  ids                                  91fd7b39-198c-4cb8-889e-837103b3c46c -wi----- 128.00m                                           
  inbox                                91fd7b39-198c-4cb8-889e-837103b3c46c -wi----- 128.00m                                           
  leases                               91fd7b39-198c-4cb8-889e-837103b3c46c -wi-----   2.00g                                           
  master                               91fd7b39-198c-4cb8-889e-837103b3c46c -wi-----   1.00g                                           
  metadata                             91fd7b39-198c-4cb8-889e-837103b3c46c -wi----- 512.00m                                           
  outbox                               91fd7b39-198c-4cb8-889e-837103b3c46c -wi----- 128.00m                                           
  lv_home                              vg0                                  -wi-ao--  38.86g                                           
  lv_root                              vg0                                  -wi-ao--  19.53g                                           
  lv_swap                              vg0                                  -wi-ao--  15.62g                                           
[root@blond-vdsh ~]# !less
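
A listing like the one above can be checked mechanically for leftovers. The following is only a rough sketch, assuming that LV names in a domain's VG can be compared against the IDs the engine still tracks; the function names and the `known_images` set are hypothetical and are not part of vdsm or the engine.

```python
# Hypothetical helper: find LVs in storage-domain VGs that the engine no longer tracks.
# The sample text is a trimmed copy of the `lvs` output in this comment.

# LVs that every block storage domain in the listing carries; they are not disks.
SPECIAL_LVS = {"ids", "inbox", "leases", "master", "metadata", "outbox"}

SAMPLE = """\
  LV                                   VG                                   Attr     LSize
  ids                                  91fd7b39-198c-4cb8-889e-837103b3c46c -wi----- 128.00m
  29af568c-e767-4217-9610-79e6c0e70602 91fd7b39-198c-4cb8-889e-837103b3c46c -wi-----   1.00g
  6c1f8d8a-8f88-4372-9546-b7b65921279e 91fd7b39-198c-4cb8-889e-837103b3c46c -wi-----   2.00g
  lv_root                              vg0                                  -wi-ao--  19.53g
"""


def find_orphans(lvs_output, domain_vgs, known_images):
    """Return {vg: [lv, ...]} for non-special LVs that are not in known_images."""
    orphans = {}
    for line in lvs_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 2 or fields[0] == "LV":
            continue
        lv, vg = fields[0], fields[1]
        if vg not in domain_vgs or lv in SPECIAL_LVS:
            continue
        if lv not in known_images:
            orphans.setdefault(vg, []).append(lv)
    return orphans


# Assumption for illustration only: pretend the engine DB still references this LV.
known = {"29af568c-e767-4217-9610-79e6c0e70602"}
result = find_orphans(SAMPLE, {"91fd7b39-198c-4cb8-889e-837103b3c46c"}, known)
print(result)
```

With the hypothetical `known` set above, only the LV matching the image ID from the description is reported.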
Comment 3 RHEL Product and Program Management 2012-05-05 00:16:13 EDT
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.
Comment 4 Ayal Baron 2012-05-05 16:51:00 EDT
No error message returned because vdsm was never called..., why is this on vdsm?
VDSM has no notion of a VM, therefore when you delete one disk, vdsm cannot correlate it to another disk on another domain (which may be unavailable).
Comment 5 Dafna Ron 2012-05-06 04:17:49 EDT
The image delete is sent to vdsm and fails there, but you are right: this should not be on vdsm, since the delete should be validated by the backend.

Thread-906::ERROR::2012-03-28 18:16:21,016::task::853::TaskManager.Task::(_setError) Task=`a8d07180-6c96-47e6-8e96-b67e6f732736`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1257, in deleteImage
    self.validateSdUUID(sdUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 208, in validateSdUUID
    sdCache.produce(sdUUID=sdUUID).validate()
  File "/usr/share/vdsm/storage/sdc.py", line 91, in produce
    dom = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 115, in _findDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('91fd7b39-198c-4cb8-889e-837103b3c46c',)
Thread-906::DEBUG::2012-03-28 18:16:21,017::task::872::TaskManager.Task::(_run) Task=`a8d07180-6c96-47e6-8e96-b67e6f732736`::Task._run: a8d07180-6c96-47e6-8e96-b67e6f732736 ('91fd7b39-198c-4cb8-889e-837103b3c46c', 'aa9c0cc8-94b3-4546-9e35-0fd52d3454fd', '9a9606a1-e6e0-4a81-a8b9-56bdf01eccb0', 'false', 'false') {} failed - stopping task
Thread-906::DEBUG::2012-03-28 18:16:21,017::task::1199::TaskManager.Task::(stop) Task=`a8d07180-6c96-47e6-8e96-b67e6f732736`::stopping in state preparing (force False)
Thread-906::DEBUG::2012-03-28 18:16:21,018::task::978::TaskManager.Task::(_decref) Task=`a8d07180-6c96-47e6-8e96-b67e6f732736`::ref 1 aborting True
Thread-906::INFO::2012-03-28 18:16:21,019::task::1157::TaskManager.Task::(prepare) Task=`a8d07180-6c96-47e6-8e96-b67e6f732736`::aborting: Task is aborted: 'Storage domain does not exist' - code 358
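
For anyone going through the attached logs, the abort line above can be picked out with a small script. This is a sketch only: the regular expression is derived from the single log line quoted in this comment, and `parse_abort` is a hypothetical helper name, not anything shipped with vdsm.

```python
import re

# Pattern inferred from the one "Task is aborted" line quoted above; other vdsm
# versions may format this line differently.
ABORT_RE = re.compile(
    r"Task=`(?P<task>[0-9a-f-]+)`::aborting: Task is aborted: "
    r"'(?P<reason>[^']*)' - code (?P<code>\d+)"
)


def parse_abort(line):
    """Return task id, reason and numeric code for an abort line, else None."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    return {
        "task": match.group("task"),
        "reason": match.group("reason"),
        "code": int(match.group("code")),
    }


LINE = (
    "Thread-906::INFO::2012-03-28 18:16:21,019::task::1157::TaskManager.Task::(prepare) "
    "Task=`a8d07180-6c96-47e6-8e96-b67e6f732736`::aborting: "
    "Task is aborted: 'Storage domain does not exist' - code 358"
)
info = parse_abort(LINE)
print(info)
```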
Comment 8 Dafna Ron 2012-06-17 10:45:36 EDT
This is partially fixed.
With preallocated disks, we get an error in the UI that prevents the command from being sent to vdsm.

When the disks are thin provisioned, the command is sent to vdsm.
Currently vdsm blocks the delete, but it's a race.

Please reproduce with thin provisioned disks.

Thin provision - error is coming from vdsm: 

2012-06-17 17:33:00,883 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-43) [2bfc812f] Failed in DeleteImageGroupVDS method
2012-06-17 17:33:00,883 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-43) [2bfc812f] Error code StorageDomainDoesNotExist and error message IRSGenericException: IRSErrorException: Failed to DeleteImageGroupVDS, error = Storage domain does not exist: ('d8c4e43a-e956-4c56-b9cb-76dc98e3ab9b',)

preallocated -> blocked with CanDoAction:  

2012-06-17 17:32:20,953 WARN  [org.ovirt.engine.core.bll.RemoveVmCommand] (ajp--0.0.0.0-8009-4) CanDoAction of action RemoveVm failed. Reasons:ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL,VAR__ACTION__REMOVE,VAR__TYPE__VM
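
The behaviour wanted for both disk types is the one shown in the CanDoAction line above: reject the removal up front if any domain holding one of the VM's disks is not active, so that no delete reaches vdsm and nothing is removed from the DB. A minimal sketch of that check follows. It is written in Python for illustration, although the engine itself is Java; the `Disk` type, the status strings and `can_remove_vm` are hypothetical, and only the reason string is taken from the log.

```python
from dataclasses import dataclass

ACTIVE = "Active"            # hypothetical status label
MAINTENANCE = "Maintenance"  # hypothetical status label

# Reason string as it appears in the RemoveVm CanDoAction log line.
REASON = "ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL"


@dataclass
class Disk:
    image_id: str
    storage_domain_id: str


def can_remove_vm(disks, domain_status):
    """Return (ok, reasons). Removal is allowed only if every domain is active.

    The check does not look at the disk's allocation policy, so thin
    provisioned and preallocated disks are treated the same way.
    """
    for disk in disks:
        # A domain missing from the status map is treated as not active.
        if domain_status.get(disk.storage_domain_id) != ACTIVE:
            return False, [REASON]
    return True, []


# IDs below are taken from this report purely as sample data.
disks = [
    Disk("6c1f8d8a-8f88-4372-9546-b7b65921279e", "91fd7b39-198c-4cb8-889e-837103b3c46c"),
    Disk("hypothetical-second-image", "83a46a9e-dac2-4513-bb21-a33ff76a495a"),
]
status = {
    "91fd7b39-198c-4cb8-889e-837103b3c46c": MAINTENANCE,
    "83a46a9e-dac2-4513-bb21-a33ff76a495a": ACTIVE,
}
ok, reasons = can_remove_vm(disks, status)
print(ok, reasons)
```

Validating all domains before sending any delete avoids the partial state described in this bug, where one image is gone from the DB while its data remains on the domain.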
Comment 9 Tal Nisan 2012-06-20 09:52:43 EDT
Checked again on si6, also with thin provisioned disks: the command is blocked by the backend and not sent to VDSM.
Comment 13 Dafna Ron 2012-07-01 11:28:59 EDT
Verified on si8.
