Bug 1054108

Summary: engine: DeactivateStorageDomainCommand fails with vdsm error: 'Operation not allowed while SPM is active' because we do not actually send SpmStop while there are unknown tasks
Product: Red Hat Enterprise Virtualization Manager
Reporter: Tomas Dosek <tdosek>
Component: ovirt-engine
Assignee: Liron Aravot <laravot>
Status: CLOSED ERRATA
QA Contact: Aharon Canan <acanan>
Severity: high
Docs Contact:
Priority: high
Version: 3.2.0
CC: acanan, acathrow, adahms, amureini, dron, iheim, jkt, laravot, lnatapov, lpeer, oourfali, ratamir, Rhev-m-bugs, scohen, yeylon, zdover
Target Milestone: ---
Keywords: ZStream
Target Release: 3.3.3
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version: org.ovirt.engine-root-3.3.0-51
Doc Type: Bug Fix
Doc Text:
Previously, attempts to deactivate the master storage domain in Red Hat Enterprise Virtualization environments with no other domain to migrate to resulted in an error under certain conditions. The logic used to deactivate storage domains failed to run the SpmStop action on the storage pool manager when zombie tasks were still present. Now, this logic has been revised so that the master storage domain is deactivated only after the storage pool manager has been stopped successfully.
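The corrected ordering described above can be sketched as follows. This is an illustrative Python sketch, not the actual ovirt-engine code (the real DeactivateStorageDomainCommand is Java); the helper names spm_stop and deactivate here are hypothetical stand-ins for the engine's SpmStop VDS command and the domain deactivation step.

```python
# Illustrative sketch of the fixed deactivation ordering. The real
# command lives in ovirt-engine (Java); spm_stop/deactivate below are
# hypothetical callables standing in for SpmStop and the deactivation.

def deactivate_master_domain(spm_stop, deactivate):
    """Deactivate the master domain only after SpmStop succeeds.

    spm_stop:   callable returning True when the SPM role was released;
                it must be attempted even when zombie/unknown tasks exist.
    deactivate: callable performing the actual domain deactivation.
    """
    if not spm_stop():
        # Before the fix, the engine skipped SpmStop when unknown tasks
        # were present, so vdsm later refused disconnectStoragePool with
        # "Operation not allowed while SPM is active". Fail fast instead.
        raise RuntimeError("SpmStop failed; master domain stays active")
    return deactivate()
```

The point of the sketch is purely the ordering: deactivation is never reached while the host still holds the SPM role.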
Story Points: ---
Clone Of: 921666
Environment:
Last Closed: 2014-05-27 09:07:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 921666, 1058022
Bug Blocks:

Comment 1 Tomas Dosek 2014-01-16 09:21:21 UTC
Reproduced on:
rhev-guest-tools-iso-3.2-15.noarch                          Tue 22 Oct 2013 07:25:34 PM CEST
rhevm-3.2.3-0.43.el6ev.noarch                               Tue 22 Oct 2013 07:25:56 PM CEST
rhevm-backend-3.2.3-0.43.el6ev.noarch                       Tue 22 Oct 2013 07:25:48 PM CEST
rhevm-cli-3.2.0.9-1.el6ev.noarch                            Tue 22 Oct 2013 07:25:21 PM CEST
rhevm-config-3.2.3-0.43.el6ev.noarch                        Tue 22 Oct 2013 07:25:37 PM CEST
rhevm-dbscripts-3.2.3-0.43.el6ev.noarch                     Tue 22 Oct 2013 07:25:37 PM CEST
rhevm-doc-3.2.1-2.el6eng.noarch                             Tue 22 Oct 2013 07:25:26 PM CEST
rhevm-dwh-3.2.1-2.el6ev.noarch                              Tue 22 Oct 2013 07:34:51 PM CEST
rhevm-genericapi-3.2.3-0.43.el6ev.noarch                    Tue 22 Oct 2013 07:25:34 PM CEST
rhevm-image-uploader-3.2.2-2.el6ev.noarch                   Tue 22 Oct 2013 07:25:22 PM CEST
rhevm-iso-uploader-3.2.2-3.el6ev.noarch                     Tue 22 Oct 2013 07:25:22 PM CEST
rhevm-log-collector-3.2.2-4.el6ev.noarch                    Tue 22 Oct 2013 07:25:22 PM CEST
rhevm-notification-service-3.2.3-0.43.el6ev.noarch          Tue 22 Oct 2013 07:25:47 PM CEST
rhevm-reports-3.2.1-6.el6ev.noarch                          Tue 22 Oct 2013 07:34:53 PM CEST
rhevm-restapi-3.2.3-0.43.el6ev.noarch                       Tue 22 Oct 2013 07:25:34 PM CEST
rhevm-sdk-3.2.1.1-1.el6ev.noarch                            Tue 22 Oct 2013 07:25:20 PM CEST
rhevm-setup-3.2.3-0.43.el6ev.noarch                         Tue 22 Oct 2013 07:22:32 PM CEST
rhevm-spice-client-x64-cab-3.2-13.el6ev.noarch              Tue 22 Oct 2013 07:25:48 PM CEST
rhevm-spice-client-x86-cab-3.2-13.el6ev.noarch              Tue 22 Oct 2013 07:25:55 PM CEST
rhevm-tools-common-3.2.3-0.43.el6ev.noarch                  Tue 22 Oct 2013 07:25:37 PM CEST
rhevm-userportal-3.2.3-0.43.el6ev.noarch                    Tue 22 Oct 2013 07:25:55 PM CEST
rhevm-webadmin-portal-3.2.3-0.43.el6ev.noarch               Tue 22 Oct 2013 07:25:46 PM CEST

Thread-178596::ERROR::2014-01-12 17:06:52,756::task::850::TaskManager.Task::(_setError) Task=`ef41e8b9-449c-4d9f-9c10-d2e6f38a6bcf`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 41, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1027, in disconnectStoragePool
    self.validateNotSPM(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 310, in validateNotSPM
    raise se.IsSpm(spUUID)
IsSpm: Operation not allowed while SPM is active: ('78074c33-a4a0-44bb-b70b-7a8beada0ff5',)
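The traceback shows vdsm's validateNotSPM guard in hsm.py rejecting disconnectStoragePool while the host still holds the SPM role. A minimal sketch of that guard (simplified; the IsSpm class here is a stand-in for vdsm's se.IsSpm storage exception, and the boolean argument replaces vdsm's internal pool-role lookup):

```python
class IsSpm(Exception):
    """Stand-in for vdsm's se.IsSpm storage exception."""
    def __init__(self, sp_uuid):
        super().__init__(
            "Operation not allowed while SPM is active: (%r,)" % (sp_uuid,))


def validate_not_spm(is_spm, sp_uuid):
    # Simplified version of hsm.py's validateNotSPM: refuse operations
    # such as disconnectStoragePool while this host is still the SPM.
    if is_spm:
        raise IsSpm(sp_uuid)
```

This is why the engine-side fix matters: unless the engine actually sends SpmStop first, every subsequent disconnect attempt hits this guard.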

Attaching relevant engine and vdsm logs in a sec.

Comment 4 Tomas Dosek 2014-01-16 09:28:44 UTC
Reproducer scenario seems to be:

1. Have an SPM host with tasks in an unknown state.
2. Put the host into maintenance.
3. Upgrade vdsm and try to activate the host.

Comment 7 Liron Aravot 2014-03-02 07:50:05 UTC
Hi Tomas, Oved,
This is currently merged for 3.4.
Adding needinfo? on Allon to clarify whether we also want this for 3.3, as there is no devel ack at the moment.

Comment 8 Allon Mureinik 2014-03-11 10:13:10 UTC
Pending qa-ack (Aharon - please ack/nack it), yes, we want this backported.

Comment 9 Aharon Canan 2014-03-11 16:02:33 UTC
Allon, back to you following our discussion.

Comment 10 Allon Mureinik 2014-03-11 16:05:40 UTC
My mistake. I missed this was ALREADY a clone.

Aharon - please ack/nack this for 3.3.3.
Liron - if this is acked, we need a backport.

Comment 11 Allon Mureinik 2014-03-16 13:30:34 UTC
(In reply to Allon Mureinik from comment #10)
> My mistake. I missed this was ALREADY a clone.
> 
> Aharon - please ack/nack this for 3.3.3.
> Liron - if this is acked, we need a backport.
qa-ack was given. Liron - please handle backports to ovirt-engine-3.4 and ovirt-engine-3.3

Comment 13 Raz Tamir 2014-05-12 12:43:54 UTC
Verified:
vdsm-4.13.2-0.14.el6ev.x86_64
rhevm-3.3.3-0.51.el6ev.noarch

Create a zombie task:
1. On a 200 GB iSCSI storage domain, create a 175 GB disk (async task).
2. Delete the task from the async_tasks table in the db (the task becomes a zombie).
3. Put the master domain into maintenance.
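Step 2 amounts to removing the engine's tracking row while vdsm is still running the task. A hedged sketch of that deletion (the async_tasks table is named in the steps above, but the task_id column name and the placeholder UUID are assumptions; the actual id would be looked up first):

```sql
-- Remove the tracking row so the engine forgets the running task while
-- vdsm still executes it, creating a "zombie" task.
-- '<task-uuid>' is a placeholder for the real async task's id.
DELETE FROM async_tasks WHERE task_id = '<task-uuid>';
```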

Comment 15 errata-xmlrpc 2014-05-27 09:07:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0547.html