Bug 917363

Summary: vdsm: can't remove/export a vm with exception on getAllVolumes
Product: Red Hat Enterprise Virtualization Manager Reporter: Dafna Ron <dron>
Component: vdsm Assignee: Eduardo Warszawski <ewarszaw>
Status: CLOSED ERRATA QA Contact: Leonid Natapov <lnatapov>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.2.0 CC: abaron, bazulay, hateya, iheim, italkohe, lpeer, lyarwood, pzhukov, rhodain, scohen, sgrinber, ykaul
Target Milestone: --- Keywords: Regression, ZStream
Target Release: 3.2.0 Flags: scohen: Triaged+
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
"--no tech note required"
Story Points: ---
Clone Of:
: 949690 Environment:
Last Closed: 2013-06-10 20:42:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 949690    
Attachments:
Description Flags
logs none

Description Dafna Ron 2013-03-03 15:10:12 UTC
Created attachment 704571
logs

Description of problem:

I can't remove or export VMs that I create on iSCSI storage on sf9 with vdsm-4.10.2-10.0.el6ev.x86_64.

Version-Release number of selected component (if applicable):

sf9
vdsm-4.10.2-10.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. On iSCSI storage, create a VM.
2. Try to export the VM.
3. Try to remove the VM.
  
Actual results:

moveImage and deleteImage both fail with a KeyError raised in getAllVolumes (see the tracebacks under Additional info).


Expected results:

We should be able to remove and move the disk.

Additional info: logs

Thread-2006::ERROR::2013-03-03 05:01:19,419::task::833::TaskManager.Task::(_setError) Task=`d543c6ff-1ea3-400b-b005-2e8deea47531`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 840, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1427, in deleteImage
    allVols = dom.getAllVolumes()
  File "/usr/share/vdsm/storage/blockSD.py", line 974, in getAllVolumes
    return getAllVolumes(self.sdUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 169, in getAllVolumes
    and vImg not in res[vPar]['imgs']:
KeyError: '4711bdc0-b974-4791-9c38-68d01a58bec5'


Thread-708::ERROR::2013-03-03 04:12:49,606::task::833::TaskManager.Task::(_setError) Task=`7a6b4d68-a1eb-4207-a021-6e2c358d6d74`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 840, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1528, in moveImage
    self.validateImageMove(srcDom, dstDom, imgUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 1483, in validateImageMove
    srcAllVols = srcDom.getAllVolumes()
  File "/usr/share/vdsm/storage/blockSD.py", line 974, in getAllVolumes
    return getAllVolumes(self.sdUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 169, in getAllVolumes
    and vImg not in res[vPar]['imgs']:
KeyError: '4711bdc0-b974-4791-9c38-68d01a58bec5'
Thread-708::DEBUG::2013-03-03 04:12:49,607::task::852::TaskManager.Task::(_run) Task=`7a6b4d68-a1eb-4207-a021-6e2c358d6d74`::Task._run: 7a6b4d68-a1eb-4207-a021-6e2c358d6d74 ('c932bf64-4642-413a-a70c-2fa3f5e40b85', '86f46c27-9700-4626-9b89-1c34b3b4d7b5', 'db60351e-b8b1-4c51-bb0f-b8f8ad016e77', '0fade626-c08a-45a7-85bf-d3033f9c6f85', '', 1, 'false', 'false') {} failed - stopping task
Thread-708::DEBUG::2013-03-03 04:12:49,608::task::1177::TaskManager.Task::(stop) Task=`7a6b4d68-a1eb-4207-a021-6e2c358d6d74`::stopping in state preparin

Comment 4 Ayal Baron 2013-03-03 19:13:03 UTC
Edu, is this a duplicate of a bug already assigned to you?

Comment 6 Eduardo Warszawski 2013-03-04 07:49:05 UTC
(In reply to comment #4)
> Edu, is this a duplicate of a bug already assigned to you?
Still investigating the root cause.

getAllVolumes() exception is the _symptom_ of a domain containing broken images.

getAllVolumes() was designed to be strict, and deleteImage() tries to avoid deleting images from a damaged SD, since the SD layout is what reveals the image layout.

Deleting images from such a domain can cause data loss.

If the orphan volumes are the product of old broken code or were created manually, it is better to fix the SD manually by removing those volumes.

Anyway this can be addressed with:
Ib8514236a5d4793f66709e9daf546fb46047414f

As I said in the comment, I'm not sure that this is the right thing to do.
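
Illustration only (this is not the vdsm code and not the referenced change): a minimal Python sketch of why a strict volume scan can raise the KeyError seen in the tracebacks when a volume names a parent that no longer exists in the domain. The tuple layout (volume UUID, image UUID, parent UUID), the names build_volume_map_strict and find_orphans, and the use of None for "no parent" are all assumptions made for this example.

```python
# Illustrative sketch only; NOT the vdsm implementation.
# Hypothetical data model: each volume is (vol_uuid, image_uuid, parent_uuid).
NO_PARENT = None  # assumption: marks a volume that has no parent


def build_volume_map_strict(volumes):
    """Map each volume to its images; fail loudly on a missing parent.

    Raises KeyError when a volume refers to a parent that is not present,
    which mirrors the kind of failure shown in the tracebacks.
    """
    res = {vol: {"imgs": [img], "parent": par} for vol, img, par in volumes}
    for _vol, img, par in volumes:
        # A parent shared by several images (e.g. a template) lists them all.
        if par is not NO_PARENT and img not in res[par]["imgs"]:
            res[par]["imgs"].append(img)
    return res


def find_orphans(volumes):
    """Return volumes whose parent is absent, without raising."""
    known = {vol for vol, _img, _par in volumes}
    return [
        vol
        for vol, _img, par in volumes
        if par is not NO_PARENT and par not in known
    ]


if __name__ == "__main__":
    healthy = [("tmpl", "imgT", NO_PARENT), ("vm", "imgV", "tmpl")]
    broken = healthy + [("orphan", "imgO", "missing")]
    print(build_volume_map_strict(healthy)["tmpl"]["imgs"])
    print(find_orphans(broken))
```

A report such as find_orphans could help identify which volumes need manual cleanup before the strict scan is retried; whether the scan itself should tolerate them is the open question raised above.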

Comment 14 Leonid Natapov 2013-04-23 14:11:25 UTC
Tested with vdsm-4.10.2-15.0.el6ev.x86_64. Can remove and export a VM when orphan images are present. Orphan images were created by live-moving multiple disks (about 100 disks). "Illegal" images were also created for the same test.

Comment 17 errata-xmlrpc 2013-06-10 20:42:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0886.html