Bug 1400622

Summary: Ovirt WebGUI Select as SPM action on a Non SPM host causes Unresponsive Data Center for ~10sec -Multiple [storage.StoragePool] Unhandled exception (utils:369) Errors seen in New SPM host
Product: [oVirt] ovirt-engine
Reporter: Avihai <aefrat>
Component: BLL.Storage
Assignee: Tal Nisan <tnisan>
Status: CLOSED DUPLICATE
QA Contact: Raz Tamir <ratamir>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: aefrat, amureini, bugs
Target Milestone: ovirt-4.2.0
Flags: rule-engine: ovirt-4.2+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-01 16:25:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Engine&VDSM log files (flags: none)

Description Avihai 2016-12-01 16:05:25 UTC
Created attachment 1226865
Engine&VDSM log files

Description of problem:
In the oVirt WebGUI, running the "Select as SPM" action on a non-SPM host causes the Data Center to become unresponsive for ~10 sec.
At the same time, multiple "[storage.StoragePool] Unhandled exception (utils:369)" errors appear in the log of the new SPM host.


Version-Release number of selected component (if applicable):
4.1.0-0.0.master.20161201071307.gita5ff876.el7.centos

How reproducible:
100%


Steps to Reproduce:
1. Have two hosts (host1 and host2) in the same DC and cluster, where host1 is the SPM and host2 is not.

2. In the WebGUI, run the "Select as SPM" action on the non-SPM host (host2).

Event log entry:
Dec 1, 2016 5:40:27 PM Host camel_vdsc was force selected by admin@internal-authz


Actual results:
The Data Center (DC1) is unresponsive for ~10 sec.
At the same time, multiple "[storage.StoragePool] Unhandled exception (utils:369)" errors appear in the log of the new SPM host.

Event log entry:
Dec 1, 2016 5:40:29 PM Invalid status on Data Center DC1. Setting status to Non Responsive.

After ~10 sec the DC becomes responsive again and host2 becomes the new SPM.
Event log entry:
Dec 1, 2016 5:40:36 PM Storage Pool Manager runs on Host camel_vdsc (Address: 10.35.116.3).
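
For reference, the length of the non-responsive window can be worked out from the three event log entries quoted in this report. The snippet below is only a rough sketch: the timestamp format string and the short event labels are my own assumptions, the month name is parsed with the default C locale, and the event log has one-second resolution, so the result is approximate.

```python
from datetime import datetime

# Assumed format of the WebGUI event log timestamps quoted in this report.
EVENT_FORMAT = "%b %d, %Y %I:%M:%S %p"

# Timestamps copied from the event log entries above; labels are shorthand.
EVENTS = [
    ("Dec 1, 2016 5:40:27 PM", "host force selected as SPM"),
    ("Dec 1, 2016 5:40:29 PM", "DC1 set to Non Responsive"),
    ("Dec 1, 2016 5:40:36 PM", "SPM runs on camel_vdsc"),
]


def seconds_between(start, end):
    """Return the whole seconds elapsed between two event timestamps."""
    delta = (datetime.strptime(end, EVENT_FORMAT)
             - datetime.strptime(start, EVENT_FORMAT))
    return int(delta.total_seconds())


# Time the DC was reported as Non Responsive.
non_responsive = seconds_between(EVENTS[1][0], EVENTS[2][0])
# Time from the forced selection until the new SPM was running.
total = seconds_between(EVENTS[0][0], EVENTS[2][0])

print("DC non responsive for %d s; SPM switch took %d s in total"
      % (non_responsive, total))
```

With the entries above this gives 7 seconds of non-responsiveness and 9 seconds from the forced selection to the new SPM, which matches the "~10 sec" estimate.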


Expected results:
The DC should not become unresponsive, and no errors should appear in the VDSM log.


Additional info:
Errors from vdsm_host2.log
(the non-SPM host on which the "Select as SPM" action was performed):

2016-12-01 17:40:32,442 INFO  (upgrade/815f73c) [storage.StorageDomain] Resource namespace 02_vol_815f73c8-1184-4a00-8883-9ce328059e2b already registered (sd:667)
2016-12-01 17:40:32,444 ERROR (upgrade/815f73c) [storage.StoragePool] Unhandled exception (utils:369)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 366, in wrapper
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 180, in run
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 201, in _upgradePoolDomain
    self._finalizePoolUpgradeIfNeeded()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-12-01 17:40:32,451 INFO  (upgrade/f7cece4) [storage.StorageDomain] Resource namespace 01_img_f7cece4e-6dcb-4c59-8f05-bb65e42b05bb already registered (sd:658)
2016-12-01 17:40:32,451 INFO  (upgrade/f7cece4) [storage.StorageDomain] Resource namespace 02_vol_f7cece4e-6dcb-4c59-8f05-bb65e42b05bb already registered (sd:667)
2016-12-01 17:40:32,453 INFO  (upgrade/432a58c) [storage.StorageDomain] Resource namespace 01_img_432a58c9-e015-4c55-bcf7-b24705c29c3c already registered (sd:658)
2016-12-01 17:40:32,454 INFO  (upgrade/432a58c) [storage.StorageDomain] Resource namespace 02_vol_432a58c9-e015-4c55-bcf7-b24705c29c3c already registered (sd:667)
2016-12-01 17:40:32,455 ERROR (upgrade/432a58c) [storage.StoragePool] Unhandled exception (utils:369)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 366, in wrapper
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 180, in run
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 201, in _upgradePoolDomain
    self._finalizePoolUpgradeIfNeeded()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-12-01 17:40:32,456 ERROR (upgrade/f7cece4) [storage.StoragePool] Unhandled exception (utils:369)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 366, in wrapper
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 180, in run
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 201, in _upgradePoolDomain
    self._finalizePoolUpgradeIfNeeded()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-12-01 17:40:32,585 INFO  (upgrade/64f8d46) [storage.LVM] Refreshing lvs: vg=64f8d460-9430-4dc8-82b5-8b6146c1537b lvs=['ids'] (lvm:1224)
2016-12-01 17:40:32,660 INFO  (upgrade/64f8d46) [storage.LVM] Refreshing lvs: vg=64f8d460-9430-4dc8-82b5-8b6146c1537b lvs=['leases'] (lvm:1224)
2016-12-01 17:40:32,761 INFO  (upgrade/64f8d46) [storage.LVM] Refreshing lvs: vg=64f8d460-9430-4dc8-82b5-8b6146c1537b lvs=['metadata', 'leases', 'ids', 'inbox', 'outbox', 'master'] (lvm:1224)
2016-12-01 17:40:32,874 INFO  (upgrade/64f8d46) [storage.StorageDomain] Resource namespace 01_img_64f8d460-9430-4dc8-82b5-8b6146c1537b already registered (sd:658)
2016-12-01 17:40:32,874 INFO  (upgrade/64f8d46) [storage.StorageDomain] Resource namespace 02_vol_64f8d460-9430-4dc8-82b5-8b6146c1537b already registered (sd:667)
2016-12-01 17:40:32,874 INFO  (upgrade/64f8d46) [storage.StorageDomain] Resource namespace 03_lvm_64f8d460-9430-4dc8-82b5-8b6146c1537b already registered (blockSD:874)
2016-12-01 17:40:32,937 ERROR (upgrade/64f8d46) [storage.StoragePool] Unhandled exception (utils:369)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 366, in wrapper
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 180, in run
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 201, in _upgradePoolDomain
    self._finalizePoolUpgradeIfNeeded()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
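
To make the shape of these tracebacks easier to follow, here is a minimal sketch of a "secured method" guard. It is NOT the vdsm securable.py code. The names Pool, secured, set_safe, set_unsafe, upgrade_domain, finalize_upgrade and run_upgrade are all hypothetical, and the sketch only assumes what the traceback itself shows: a wrapper that refuses the call with SecureError while the object is not in a safe state.

```python
import functools
import threading


class SecureError(RuntimeError):
    """Raised when a guarded method runs while the object is not safe."""


def secured(method):
    """Allow the call only while the owning object is flagged as safe."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self._safe.is_set():
            raise SecureError("Secured object is not in safe state")
        return method(self, *args, **kwargs)
    return wrapper


class Pool(object):
    """Hypothetical stand-in for a storage pool object."""

    def __init__(self):
        # Event cleared means "unsafe"; this is the initial state.
        self._safe = threading.Event()

    def set_safe(self):
        self._safe.set()

    def set_unsafe(self):
        self._safe.clear()

    def upgrade_domain(self):
        # Unguarded entry point that ends by calling a guarded method,
        # loosely mirroring the call order seen in the traceback.
        return self.finalize_upgrade()

    @secured
    def finalize_upgrade(self):
        return "finalized"


def run_upgrade(pool):
    """Run the upgrade and report the outcome instead of propagating."""
    try:
        return pool.upgrade_domain()
    except SecureError as e:
        return "error: %s" % e


pool = Pool()
print(run_upgrade(pool))   # guarded call is refused while unsafe
pool.set_safe()
print(run_upgrade(pool))   # guarded call goes through once safe
```

If the real code behaves along these lines, a background upgrade thread that reaches the guarded call while the pool's state is changing during the SPM switch would fail in this way. Whether that is what actually happens here needs to be confirmed against the vdsm sources.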

Comment 1 Allon Mureinik 2016-12-01 16:21:30 UTC
By definition, if there's no SPM, the DC should become unresponsive until a new one is selected. Reducing severity for now. Not sure if there's anything better we can do, but we'll look into it.

Comment 2 Raz Tamir 2016-12-01 16:25:04 UTC

*** This bug has been marked as a duplicate of bug 1400534 ***