Bug 1082365 - attribute error when executing fenceSpmStorage
Summary: attribute error when executing fenceSpmStorage
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.4
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.4.1
Assignee: Liron Aravot
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Duplicates: 1103165
Depends On: 1092631
Blocks:
Reported: 2014-03-30 20:05 UTC by md
Modified: 2016-02-10 18:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1092631
Environment:
Last Closed: 2014-05-08 13:38:25 UTC
oVirt Team: Storage
Embargoed:


Attachments
log when Node1 was powered off (167.18 KB, text/plain)
2014-04-02 07:09 UTC, md
log when Node1 was powered off (483.77 KB, text/plain)
2014-04-02 07:11 UTC, md
log when Node1 is back again (215.20 KB, text/plain)
2014-04-02 07:13 UTC, md
log when Node1 is back again (999.75 KB, text/plain)
2014-04-02 07:14 UTC, md
log when Node1 is back again (713.38 KB, text/plain)
2014-04-02 07:15 UTC, md


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 26358 0 None MERGED core: fixing attribute error on fenceSpmStorage Never
oVirt gerrit 27226 0 None None None Never
oVirt gerrit 27340 0 None MERGED core: return lver/spm id from pool metadata Never
oVirt gerrit 27341 0 None MERGED core: fixing attribute error on fenceSpmStorage Never

Description md 2014-03-30 20:05:06 UTC
Description of problem:

- 2 Node Cluster
- When the SPM Node fails, the Datacenter goes down and no new SPM is elected
- When fencing takes place and is successful, the message is:

Manual fence did not revoke the selected SPM (Node1) since the master storage domain was not active or could not use another host for the fence operation.

- Manually fencing the Host makes no difference
- I found no way to switch the SPM to a working Node, which means the Datacenter is unusable until the original SPM is back online

Version-Release number of selected component (if applicable):

- 3.4.0

How reproducible:


Steps to Reproduce:

1. Power off SPM Node
2. Prevent the Host from coming up again after fencing took place

Actual results:

- Datacenter is down, SPM is Non Responsive
- When trying to manually switch the SPM, the message is:

Error while executing action: Cannot force select SPM: Storage Domain cannot be accessed.
-Please check that at least one Host is operational and Data Center state is up.


Expected results:

- A new SPM Node is elected

Comment 1 md 2014-03-30 20:07:39 UTC
2014-03-30 22:07:43,323 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-22) hostFromVds::selectedVds - Node2, spmStatus Free, storage pool Default
2014-03-30 22:07:43,325 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-22) SPM Init: could not find reported vds or not up - pool:Default vds_spm_id: 1
2014-03-30 22:07:43,350 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-22) SPM selection - vds seems as spm Node1
2014-03-30 22:07:43,351 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-22) spm vds is non responsive, stopping spm selection.
2014-03-30 22:07:43,714 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-78) Command GetCapabilitiesVDSCommand(HostName = Node1, HostId = ff474b41-22c5-440e-8052-4cf40c27b250, vds=Host[Node1]) execution failed. Exception: VDSNetworkException: java.net.SocketTimeoutException: connect timed out
2014-03-30 22:07:48,848 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-26) Command GetCapabilitiesVDSCommand(HostName = Node1, HostId = ff474b41-22c5-440e-8052-4cf40c27b250, vds=Host[Node1]) execution failed. Exception: VDSNetworkException: java.net.SocketTimeoutException: connect timed out
2014-03-30 22:07:53,396 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-71) [4b275d81] hostFromVds::selectedVds - Node2, spmStatus Free, storage pool Default
2014-03-30 22:07:53,398 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-71) [4b275d81] SPM Init: could not find reported vds or not up - pool:Default vds_spm_id: 1
2014-03-30 22:07:53,424 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-71) [4b275d81] SPM selection - vds seems as spm Node1
2014-03-30 22:07:53,425 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-71) [4b275d81] spm vds is non responsive, stopping spm selection.
2014-03-30 22:07:53,973 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-37) Command GetCapabilitiesVDSCommand(HostName = Node1, HostId = ff474b41-22c5-440e-8052-4cf40c27b250, vds=Host[Node1]) execution failed. Exception: VDSNetworkException: java.net.SocketTimeoutException: connect timed out
2014-03-30 22:07:59,072 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-49) Command GetCapabilitiesVDSCommand(HostName = Node1, HostId = ff474b41-22c5-440e-8052-4cf40c27b250, vds=Host[Node1]) execution failed. Exception: VDSNetworkException: java.net.SocketTimeoutException: connect timed out
2014-03-30 22:08:03,472 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-35) hostFromVds::selectedVds - Node2, spmStatus Free, storage pool Default
2014-03-30 22:08:03,477 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-35) SPM Init: could not find reported vds or not up - pool:Default vds_spm_id: 1
2014-03-30 22:08:03,502 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-35) SPM selection - vds seems as spm Node1
2014-03-30 22:08:03,503 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-35) spm vds is non responsive, stopping spm selection.
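
For anyone checking a longer engine.log for the same cycle, here is a minimal sketch that counts how often SPM selection was stopped and how often the connection to the host timed out. The function name `summarize` and the counter keys are hypothetical, and the regular expression is only assumed to fit lines formatted like the excerpt above.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# "<date> <time>,<ms> <LEVEL>  [<logger>] (<thread>) <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count the two symptoms seen in this report (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if "stopping spm selection" in msg:
            counts["selection_stopped"] += 1
        elif "connect timed out" in msg:
            counts["connect_timeout"] += 1
    return counts


# Two lines copied from the log excerpt in this comment.
sample = [
    "2014-03-30 22:07:43,351 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] "
    "(DefaultQuartzScheduler_Worker-22) spm vds is non responsive, stopping spm selection.",
    "2014-03-30 22:07:43,714 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] "
    "(DefaultQuartzScheduler_Worker-78) Command GetCapabilitiesVDSCommand(HostName = Node1, "
    "HostId = ff474b41-22c5-440e-8052-4cf40c27b250, vds=Host[Node1]) execution failed. "
    "Exception: VDSNetworkException: java.net.SocketTimeoutException: connect timed out",
]
summary = summarize(sample)
print(dict(summary))
```

To run it against a real file, pass the open file object instead of `sample`, since iterating a file yields its lines.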

Comment 2 Allon Mureinik 2014-04-01 09:28:16 UTC
Liron, can you please take a look at this?

Comment 3 Allon Mureinik 2014-04-01 09:29:21 UTC
Thanks for reporting this issue!

Could you please attach the full engine and vdsm logs?

Comment 4 Liron Aravot 2014-04-01 14:05:45 UTC
Sure, waiting for the logs as the current paste is not enough for performing RCA of the issue.

Comment 5 md 2014-04-01 17:07:44 UTC
Thanks for replying, I will upload the logs tomorrow as I have no access to the System today.

Comment 6 md 2014-04-02 07:09:09 UTC
Created attachment 881647 [details]
log when Node1 was powered off

Comment 7 md 2014-04-02 07:11:53 UTC
Created attachment 881648 [details]
log when Node1 was powered off

Comment 8 md 2014-04-02 07:13:53 UTC
Created attachment 881649 [details]
log when Node1 is back again

Comment 9 md 2014-04-02 07:14:53 UTC
Created attachment 881650 [details]
log when Node1 is back again

Comment 10 md 2014-04-02 07:15:31 UTC
Created attachment 881651 [details]
log when Node1 is back again

Comment 11 Liron Aravot 2014-04-02 09:02:44 UTC
Hi,
the issue here is an AttributeError, caused by calling a method on the wrong object.

Thread-43::INFO::2014-04-02 08:57:34,510::logUtils::44::dispatcher::(wrapper) Run and protect: fenceSpmStorage(spUUID='00000002-0002-0002-0002-0000000000ea', lastOwner=None, lastLver=None, options=None)
Thread-43::ERROR::2014-04-02 08:57:34,511::task::866::TaskManager.Task::(_setError) Task=`6f4cb2fc-816c-41f1-bcf5-272396b40b40`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3546, in fenceSpmStorage
    pool.invalidateMetadata()
AttributeError: 'StoragePool' object has no attribute 'invalidateMetadata'

I will add a fix soon.
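
To illustrate the failure mode in the traceback, here is a minimal sketch in plain Python. It is not the vdsm code: `MetadataOwner`, the `metadata_owner` attribute, `fence_buggy` and `fence_fixed` are all hypothetical names, and this report does not show which object the merged patches actually call. The sketch only shows that calling a method on an object that does not define it raises the AttributeError above, while calling it on the object that does define it succeeds.

```python
class MetadataOwner:
    """Hypothetical object that actually implements invalidateMetadata()."""

    def __init__(self):
        self.invalidated = False

    def invalidateMetadata(self):
        self.invalidated = True


class StoragePool:
    """Hypothetical stand-in for the pool; it defines no invalidateMetadata()."""

    def __init__(self, metadata_owner):
        # Hypothetical attribute holding the object that owns the metadata.
        self.metadata_owner = metadata_owner


def fence_buggy(pool):
    # Same shape as the failing call in the traceback: method called on the pool.
    pool.invalidateMetadata()


def fence_fixed(pool):
    # Call the method on the object that defines it (assumed layout).
    pool.metadata_owner.invalidateMetadata()


owner = MetadataOwner()
pool = StoragePool(owner)

try:
    fence_buggy(pool)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

fence_fixed(pool)
print(error_message)
print(owner.invalidated)
```

The point of the sketch is that the error is raised at call time, so the code path only fails when fenceSpmStorage is actually executed, which matches the situation described in this report.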

Comment 12 Sandro Bonazzola 2014-05-08 13:38:25 UTC
This is an automated message

oVirt 3.4.1 has been released:
 * should fix your issue
 * should be available at your local mirror within two days.

If problems still persist, please make note of it in this bug report.

Comment 13 Liron Aravot 2014-06-02 10:55:38 UTC
*** Bug 1103165 has been marked as a duplicate of this bug. ***

