Bug 964649

Summary: vdsm: spmStop will be called because of 'TypeError: 'dictionary-keyiterator' object is unsubscriptable' error in vdsm when removing a template with depended vms from the export domain
Product: Red Hat Enterprise Virtualization Manager Reporter: Dafna Ron <dron>
Component: vdsm Assignee: Yeela Kaplan <ykaplan>
Status: CLOSED ERRATA QA Contact: Elad <ebenahar>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.2.0 CC: abaron, acathrow, amureini, bazulay, cboyle, cfergeau, dblechte, ebenahar, iheim, jkt, laravot, lpeer, oourfali, scohen, yeylon, ykaplan
Target Milestone: --- Keywords: Triaged
Target Release: 3.3.0 Flags: amureini: Triaged+
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: v4.13.0 Doc Type: Bug Fix
Doc Text:
Previously, removing a template with dependent virtual machines failed because of a typo in the code. This has now been corrected, and fake parameters are also removed before the template is deleted.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-21 16:07:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Dafna Ron 2013-05-19 11:29:34 UTC
Created attachment 750025 [details]
logs

Description of problem:

When we remove a template with dependent VMs from the export domain, we get the following error in vdsm:

Thread-4457::ERROR::2013-05-19 14:12:11,930::dispatcher::69::Storage.Dispatcher.Protect::(run) 'dictionary-keyiterator' object is unsubscriptable
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 61, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1159, in prepare
    raise self.error
TypeError: 'dictionary-keyiterator' object is unsubscriptable
Thread-4458::DEBUG::2013-05-19 14:12:11,957::BindingXMLRPC::161::vds::(wrapper) [10.35.161.49]
Thread-4458::DEBUG::2013-05-19 14:12:11,958::task::579::TaskManager.Task::(_updateState) Task=`2e5821b3-8634-41c4-847d-a9b80270152e`::moving from state init -> state preparing

This causes the engine to send spmStop.
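
For context, this TypeError is what Python 2 raises when code indexes the iterator returned by dict.iterkeys() instead of a list of keys. The sketch below is not the vdsm code. It only reproduces the same class of mistake in a form that also runs on Python 3 (where the message is worded slightly differently), using a hypothetical dictionary of volumes and hypothetical function names:

```python
def first_key_broken(mapping):
    # iter(mapping) stands in for Python 2's mapping.iterkeys():
    # it is an iterator, so indexing it raises TypeError.
    return iter(mapping)[0]


def first_key_fixed(mapping):
    # Take the first key from the iterator instead of indexing it.
    return next(iter(mapping))


# Hypothetical data: volume UUID -> parent UUID.
volumes = {"vol-1": "parent-0"}

try:
    first_key_broken(volumes)
except TypeError as err:
    print("broken:", err)

print("fixed:", first_key_fixed(volumes))
```

Converting the keys to a list before indexing would also avoid the error, at the cost of copying all keys.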

Version-Release number of selected component (if applicable):

sf17
vdsm-4.10.2-19.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Export a VM created from a template, together with the template, to the export domain
2. Delete the template
  
Actual results:

The following error in vdsm causes the engine to send spmStop, and the template is reported as successfully removed:

Thread-4457::ERROR::2013-05-19 14:12:11,930::dispatcher::69::Storage.Dispatcher.Protect::(run) 'dictionary-keyiterator' object is unsubscriptable
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 61, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1159, in prepare
    raise self.error
TypeError: 'dictionary-keyiterator' object is unsubscriptable



Expected results:

Not sure why we get the error, but we should not send spmStop if the flow is to continue with the delete.

Additional info: logs

Comment 1 Yeela Kaplan 2013-08-19 07:06:59 UTC
This bug reveals a different bug in the engine: spmStop is called if the delete fails.

Should another bug be opened against the engine component for this?

Comment 2 Christophe Fergeau 2013-08-19 10:40:38 UTC
Yeela, moving this bug to spice-vdagent-win seems like it was unintentional?

Comment 4 Allon Mureinik 2013-08-21 07:33:24 UTC
(In reply to Yeela Kaplan from comment #1)
> This bug is revealing a different bug in engine, which is the calling of
> spmStop if delete fails. 
> 
> 
> Should there be opened another bug in engine component for this?
Didn't you already handle something similar?

Comment 5 Liron Aravot 2013-08-21 08:13:40 UTC
When removing a template, the first call is to the vdsm verb "RemoveVM", which attempts to remove the template OVF from the export domain. Therefore, even when removing the disks fails, the template no longer appears when listing the VMs/templates in the export domain.

The SPM failover process is executed depending on the error code received from vdsm. In this case, the error returned because of the bug was "GeneralException", and therefore the failover process ran. IMO the engine behaviour was fine in that case. If, after the fix, there are error codes that we want to ignore when performing deleteImageGroup, that could be done, but that's a separate bug.
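
A rough sketch of that decision, for illustration only. This is not engine code (the engine is not written in Python), and every name except "GeneralException" and "deleteImageGroup" is hypothetical, including the placeholder error "SomeIgnorableError":

```python
# Hypothetical: error names that make the engine start SPM failover.
FAILOVER_ERRORS = {"GeneralException"}

# Hypothetical: per-verb errors that a later change might choose to ignore.
IGNORABLE_BY_VERB = {"deleteImageGroup": {"SomeIgnorableError"}}


def should_start_spm_failover(verb, error_name):
    """Return True if this error from this verb should trigger SPM failover."""
    if error_name in IGNORABLE_BY_VERB.get(verb, set()):
        return False
    return error_name in FAILOVER_ERRORS


# The TypeError in this bug surfaced as GeneralException, so failover ran.
print(should_start_spm_failover("deleteImageGroup", "GeneralException"))
```

Under this model, ignoring a specific error for deleteImageGroup is a change to the ignore table, not to the failover logic itself.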

Comment 6 Allon Mureinik 2013-08-21 13:01:11 UTC
(In reply to Liron Aravot from comment #5)
> When removing a template, the first call is to the vdsm verb "RemoveVM"
> which attempts to remove the template ovf from the export domain, therefore,
> even when failing to remove the disks the template won't appear anymore when
> listing the vm/templates in the export domain.
> 
> Spm failover process is being executed depending on the error code received
> from vdsm, in this case the returned error because of the bug was
> "GeneralException" and therefore the failover process ran. IMO the engine
> behaviour was fine in that case, if after fix there will be error codes that
> we would want to ignore when performing deleteImageGroup, it could be done -
> but that's a separate bug.
Agreed - this is a non-issue, IMO.

Comment 7 Elad 2013-10-27 14:55:37 UTC
deleteImage is failing on vdsm when trying to remove a template, which has dependent VMs, from export domain.

### Failure in vdsm:

Thread-158357::ERROR::2013-10-27 10:50:25,678::fileSD::359::Storage.StorageDomain::(deleteImage) vol: /rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_elad1/cfa261ae-19a6-47bb-b458-7ebb994d4027/images/_remoCDsYm1/a691fd35-a122-474a-aa40-4c9f22b7850b can't be removed.
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/fileSD.py", line 356, in deleteImage
    self.oop.os.remove(volPath + '.lease')
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 297, in callCrabRPCFunction
    *args, **kwargs)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 199, in callCrabRPCFunction
    raise err
OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_elad1/cfa261ae-19a6-47bb-b458-7ebb994d4027/images/_remoCDsYm1/a691fd35-a122-474a-aa40-4c9f22b7850b.lease'


This failure does not cause the engine to start SPM election, so the original problem no longer exists.

Is deleteImage supposed to fail?

Comment 8 Allon Mureinik 2013-11-27 18:39:15 UTC
Export domains do not have to be self-descriptive, so no, this is not supposed to fail, certainly not with a "no such file" error.

Yeela, can you please look into this?

Comment 9 Charlie 2013-11-28 00:31:16 UTC
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 10 Yeela Kaplan 2013-12-04 12:31:31 UTC
OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_elad1/cfa261ae-19a6-47bb-b458-7ebb994d4027/images/_remoCDsYm1/a691fd35-a122-474a-aa40-4c9f22b7850b.lease'

deleteImage does not seem to fail (unless you have any other logs that show it).
Only the removal of the lease file fails, because there are no lease files in an export domain, so it fails with 'no such file'.
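
As an illustration of treating a missing lease file as non-fatal, here is a minimal sketch. It is not the vdsm implementation: it calls os.remove directly instead of going through self.oop, and the helper name remove_lease_if_present is hypothetical.

```python
import errno
import os
import tempfile


def remove_lease_if_present(vol_path):
    """Remove vol_path + '.lease'; return False if there was no lease file."""
    try:
        os.remove(vol_path + ".lease")
    except OSError as err:
        # Only "no such file" is tolerated; any other error is re-raised.
        if err.errno != errno.ENOENT:
            raise
        return False
    return True


# Usage with a throwaway directory standing in for an export domain image dir.
with tempfile.TemporaryDirectory() as image_dir:
    vol = os.path.join(image_dir, "volume")
    print(remove_lease_if_present(vol))  # no lease file exists yet
    open(vol + ".lease", "w").close()
    print(remove_lease_if_present(vol))  # lease file exists and is removed
```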

Comment 11 Elad 2013-12-04 12:43:12 UTC
(In reply to Yeela Kaplan from comment #10)
> OSError: [Errno 2] No such file or directory:
> '/rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_elad1/
> cfa261ae-19a6-47bb-b458-7ebb994d4027/images/_remoCDsYm1/a691fd35-a122-474a-
> aa40-4c9f22b7850b.lease'
> 
> deleteImage does not seem to fail (unless you have any other logs to show
> it). 
> This is only the lease file that fails, and that is because there are no
> lease files in export domain, so it fails on 'no such file'.

Thanks,
So regardless of this error, SPM election does not take place.

Verified with is20
vdsm-4.13.0-0.5.beta1.el6ev.x86_64.rpm
rhevm-3.3.0-0.28.beta1.el6ev.noarch.rpm

Comment 12 Charlie 2014-01-03 04:23:03 UTC
Hi there, is it possible to get some errata text for this, or a quick overview of the cause and fix? Thanks.

Comment 13 errata-xmlrpc 2014-01-21 16:07:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html