Bug 1444120 - After upgrade DC&cluster + Engine restart Reconstruct Master Domain for Data Center fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.1.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.3
: 4.1.3.3
Assignee: Benny Zlotnik
QA Contact: Avihai
URL:
Whiteboard:
Depends On:
Blocks: 1459224
 
Reported: 2017-04-20 15:30 UTC by Avihai
Modified: 2017-07-06 13:26 UTC (History)
6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1459224 (view as bug list)
Environment:
Last Closed: 2017-07-06 13:26:30 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+


Attachments
engine & vdsm logs (2.29 MB, application/x-gzip)
2017-04-20 15:30 UTC, Avihai


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 77866 0 master MERGED core: change order of validations in RemoveStorageDomainCommand 2020-05-10 20:09:49 UTC
oVirt gerrit 77957 0 ovirt-engine-4.1 MERGED core: change order of validations in RemoveStorageDomainCommand 2020-05-10 20:09:49 UTC

Description Avihai 2017-04-20 15:30:44 UTC
Created attachment 1273021 [details]
engine & vdsm logs

Description of problem:
After upgrading the DC and cluster from 4.0 to 4.1 and restarting the engine, Reconstruct Master Domain for the Data Center fails.


Version-Release number of selected component (if applicable):
Engine:
4.1.1.8

VDSM:
4.19.10.1

How reproducible:
Hard to reproduce; roughly 1 in 5 runs or worse.

Steps to Reproduce:
Main automation test flow:
1. Create a DC + cluster on v3
2. Upgrade the cluster from v3 to v4
3. Restart the engine during the cluster upgrade
4. Verify that the cluster was upgraded to v4

Automation teardown:
5. After checking that the DC is in the "active" state, remove the DC
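
The flow above hinges on polling with a timeout: waiting for the engine health page after the restart, and waiting for the DC to report "up" before teardown (the automation samples every 10 seconds for up to 180 seconds). A minimal sketch of such a helper, assuming hypothetical names (this is not the actual ART library code):

```python
import time

def wait_for(check, timeout=180, interval=10, clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout seconds elapse.

    Returns True on success, False on timeout -- the same shape as the
    automation's "wait for DC status=up, sampling every 10 seconds" step.
    clock/sleep are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would pass, for example, a closure that GETs the engine health page and returns True on HTTP 200.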

Actual results:
After the engine restart, Reconstruct Master Domain for the Data Center fails.

Expected results:
After the engine restart, Reconstruct Master Domain for the Data Center should succeed.

Additional info:
Timeline from the automation log:
2017-04-19 19:34:11,016 - MainThread - root - INFO - Update cluster cluster_upgrade_4_0_to_4_1 with {'compare': False, 'version': '4.1'}
2017-04-19 19:34:11,305 - MainThread - clusters - INFO - Using Correlation-Id: clusters_update_ada4c825-9dd9-41be
2017-04-19 19:34:11,756 - MainThread - clusters - INFO - cluster was updated
2017-04-19 19:34:29,311 - MainThread - Engine - INFO - [10.35.161.118] GET http://storage-ge-04.scl.lab.tlv.redhat.com/ovirt-engine/services/health
2017-04-19 19:34:29,668 - MainThread - Engine - ERROR - [10.35.161.118]   failed to get content of health page: HTTP Error 503: Service Unavailable
2017-04-19 19:34:50,287 - MainThread - Engine - ERROR - [10.35.161.118]   failed to get content of health page: HTTP Error 404: Not Found
2017-04-19 19:35:10,313 - MainThread - Engine - DEBUG - Skipping duplicate log-messages...
2017-04-19 19:35:15,887 - MainThread - Engine - INFO - [10.35.161.118]   CODE: 200, OK
2017-04-19 19:35:15,888 - MainThread - Engine - INFO - [10.35.161.118]   Content: DB Up!Welcome to Health Status!
2017-04-19 19:35:19,397 - MainThread - Engine - INFO - [10.35.161.118] GET http://storage-ge-04.scl.lab.tlv.redhat.com/ovirt-engine/services/health
2017-04-19 19:35:19,523 - MainThread - art.hl_lib.dcs - INFO - Wait for the Data center to become active
2017-04-19 19:35:20,817 - MainThread - datacenters - INFO - Waiting for query `name=dc_upgrade_4_0_to_4_1 and status=up` and event_id None up to 180 seconds,sampling every 10 second.
2017-04-19 19:35:21,077 - MainThread - root - INFO - Waits until SPM gets elected in DataCenter dc_upgrade_4_0_to_4_1 with {'sleep': 5, 'timeout': 300}
2017-04-19 19:35:21,086 - MainThread - root - INFO - Locate and return SPM or HSM host from specific datacenter dc_upgrade_4_0_to_4_1 with {'spm': True}
2017-04-19 19:35:22,159 - MainThread - root - INFO - Checking Storage Pool Manager (SPM) status of the host host_mixed_1.
2017-04-19 19:35:22,166 - MainThread - root - INFO - Get host object by host_name host_mixed_1. with {'attribute': 'name'}
2017-04-19 19:35:22,703 - MainThread - art.ll_lib.hosts - INFO - check_host_spm_status - SPM Status of host host_mixed_1 is: spm
2017-04-19 19:35:23,473 - MainThread - art.hl_lib.dcs - INFO - Waiting up to 120 seconds for sd upgrade_4_0_to_4_1glusterfs0 to be active
2017-04-19 19:35:24,061 - MainThread - root - INFO - Get Cluster cluster_upgrade_4_0_to_4_1 compatibility version
2017-04-19 19:35:24,311 - MainThread - art.logging - INFO - Status: passed
PASSED
2017-04-19 19:35:24,313 - MainThread - art.logging - INFO - ================================================================================
2017-04-19 19:35:24,313 - MainThread - art.logging - INFO - TEARDOWN <TestCaseFunction 'test_restart_engine_during_upgrade_cluster'>
2017-04-19 19:35:24,314 - MainThread - pytest.art.matrix - INFO - The storage type reset to None
2017-04-19 19:35:26,708 - MainThread - rhevmtests.storage.helpers - INFO - Checking if domain upgrade_4_0_to_4_1glusterfs0 is active in dc dc_upgrade_4_0_to_4_1
2017-04-19 19:35:27,376 - MainThread - storagedomains - INFO - Domain upgrade_4_0_to_4_1glusterfs0 in dc dc_upgrade_4_0_to_4_1 is unknown
2017-04-19 19:35:27,376 - MainThread - art.ll_lib.dcs - INFO - Remove datacenter dc_upgrade_4_0_to_4_1
2017-04-19 19:35:27,643 - MainThread - datacenters - INFO - Using Correlation-Id: datacenters_delete_e74bc52a-1294-4256
2017-04-19 19:35:28,386 - MainThread - api_utils - ERROR - Failed to delete element NOT as expected:
	Status: 409
	Reason: Conflict
	Detail: [Cannot remove an active Data Center.]





From Engine:
2017-04-19 19:34:11,580+03 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] (default task-4) [clusters_update_ada4c825-9dd9-41be] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[]'}'
2017-04-19 19:34:11,695+03 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] (default task-4) [clusters_update_ada4c825-9dd9-41be] Running command: UpdateClusterCommand internal: false. Entities affected :  ID: 15767467-bcab-418c-a314-78ebaf093a75 Type: ClusterAction group EDIT_CLUSTER_CONFIGURATION with role type ADMIN
2017-04-19 19:34:11,725+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [clusters_update_ada4c825-9dd9-41be] EVENT_ID: USER_UPDATE_CLUSTER(811), Correlation ID: clusters_update_ada4c825-9dd9-41be, Call Stack: null, Custom Event ID: -1, Message: Host cluster cluster_upgrade_4_0_to_4_1 was updated by admin@internal-authz
2017-04-19 19:34:11,726+03 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] (default task-4) [clusters_update_ada4c825-9dd9-41be] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[]'}'
2017-04-19 19:34:12,265+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler9) [655f8748] FINISH, GetHostJobsVDSCommand, return: {}, log id: 15775af4
2017-04-19 19:34:12,273+03 INFO  [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler9) [655f8748] Command AmendVolume id: '36d67e16-2469-4785-b910-46c95adf4566': attempting to determine the job status by polling the entity.
2017-04-19 19:34:12,276+03 ERROR [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler9) [655f8748] Command AmendVolume id: '36d67e16-2469-4785-b910-46c95adf4566': failed to poll the command entity
2017-04-19 19:34:12,276+03 INFO  [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler9) [655f8748] Command AmendVolume id: '36d67e16-2469-4785-b910-46c95adf4566': couldn't get the status of job 'b4f15b65-1695-452f-a5fe-82f3c11d7c19' on host 'host_mixed_1' (id: '0ba63f2d-06b2-4392-82ce-7abd60614b2a'), assuming it's still running
2017-04-19 19:34:15,914+03 ERROR [org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap] (DefaultQuartzScheduler2) [655f8748] Exception occured during response decomposition: java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) [rt.jar:1.8.0_121]
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) [rt.jar:1.8.0_121]
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) [rt.jar:1.8.0_121]
        at org.ovirt.vdsm.jsonrpc.client.internal.BatchCall.get(BatchCall.java:53) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.internal.Call.get(Call.java:48) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.internal.Call.get(Call.java:19) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.lazyEval(FutureMap.java:112) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.get(FutureMap.java:247) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VMInfoListReturn.<init>(VMInfoListReturn.java:15) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer.getAllVmStats(JsonRpcVdsServer.java:369) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand.executeVdsBrokerCommand(GetAllVmStatsVDSCommand.java:22) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:111) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.DefaultVdsCommandExecutor.execute(DefaultVdsCommandExecutor.java:14) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:407) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher.poll(VmsStatisticsFetcher.java:29) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.VmsListFetcher.fetch(VmsListFetcher.java:56) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher.poll(PollVmStatsRefresher.java:37) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor345.invoke(Unknown Source) [:1.8.0_121]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
        at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]





2017-04-19 19:35:40,998+03 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler1) [4dc8be4] IrsBroker::Failed::GetStoragePoolInfoVDS: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain: 'spUUID=e23a7d84-49b5-47c0-b5ea-0204648d1dd4, msdUUID=a7ba26e7-1ff4-437f-bd5e-690ef13a107d'
2017-04-19 19:35:41,171+03 INFO  [org.ovirt.engine.core.bll.storage.pool.ReconstructMasterDomainCommand] (org.ovirt.thread.pool-6-thread-22) [5c58f757] Running command: ReconstructMasterDomainCommand internal: true. Entities affected :  ID: a7ba26e7-1ff4-437f-bd5e-690ef13a107d Type: Storage
2017-04-19 19:35:41,258+03 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (org.ovirt.thread.pool-6-thread-22) [5c58f757] START, ResetIrsVDSCommand( ResetIrsVDSCommandParameters:{runAsync='true', storagePoolId='e23a7d84-49b5-47c0-b5ea-0204648d1dd4', ignoreFailoverLimit='false', vdsId='0ba63f2d-06b2-4392-82ce-7abd60614b2a', ignoreStopFailed='true'}), log id: 7148035d
2017-04-19 19:35:41,268+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (org.ovirt.thread.pool-6-thread-22) [5c58f757] START, SpmStopVDSCommand(HostName = host_mixed_1, SpmStopVDSCommandParameters:{runAsync='true', hostId='0ba63f2d-06b2-4392-82ce-7abd60614b2a', storagePoolId='e23a7d84-49b5-47c0-b5ea-0204648d1dd4'}), log id: 2c512330
2017-04-19 19:35:41,299+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (org.ovirt.thread.pool-6-thread-22) [5c58f757] SpmStopVDSCommand::Stopping SPM on vds 'host_mixed_1', pool id 'e23a7d84-49b5-47c0-b5ea-0204648d1dd4'
2017-04-19 19:35:47,260+03 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler7) [655f8748] Command 'AmendImageGroupVolumes' (id: '59a83751-3637-48fd-844e-c284a95e9e85') waiting on child command id: '36d67e16-2469-4785-b910-46c95adf4566' type:'AmendVolume' to complete
2017-04-19 19:35:47,286+03 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler7) [7f18269b] Command 'AmendImageGroupVolumes' (id: 'f99e15c6-2099-43f3-a77b-528c782d58cb') waiting on child command id: '33d9d5ec-bf29-450b-8026-e102bd9fa859' type:'AmendVolume' to complete
2017-04-19 19:35:47,313+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler7) [7f18269b] START, GetHostJobsVDSCommand(HostName = host_mixed_1, GetHostJobsVDSCommandParameters:{runAsync='true', hostId='0ba63f2d-06b2-4392-82ce-7abd60614b2a', type='storage', jobIds='[a9c7a685-a4c7-4943-a66c-d52f71646020]'}), log id: 7941ea3c
2017-04-19 19:35:47,315+03 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [] Unable to process messages Broken pipe
2017-04-19 19:35:47,327+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler9) [655f8748] Command 'GetAllVmStatsVDSCommand(HostName = host_mixed_1, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='0ba63f2d-06b2-4392-82ce-7abd60614b2a'})' execution failed: VDSGenericException: VDSNetworkException: Broken pipe
2017-04-19 19:35:47,339+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (DefaultQuartzScheduler9) [655f8748] Failed to fetch vms info for host 'host_mixed_1' - skipping VMs monitoring.
2017-04-19 19:35:47,349+03 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (org.ovirt.thread.pool-6-thread-23) [655f8748] Host 'host_mixed_1' is not responding.
2017-04-19 19:35:47,369+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler7) [7f18269b] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_1 command GetHostJobsVDS failed: Broken pipe
2017-04-19 19:35:47,370+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler7) [7f18269b] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand' return value 'HostJobsReturn:{status='Status [code=5022, message=Broken pipe]'}'
2017-04-19 19:35:47,370+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler7) [7f18269b] HostName = host_mixed_1
2017-04-19 19:35:47,374+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler7) [7f18269b] Command 'GetHostJobsVDSCommand(HostName = host_mixed_1, GetHostJobsVDSCommandParameters:{runAsync='true', hostId='0ba63f2d-06b2-4392-82ce-7abd60614b2a', type='storage', jobIds='[a9c7a685-a4c7-4943-a66c-d52f71646020]'})' execution failed: VDSGenericException: VDSNetworkException: Broken pipe
2017-04-19 19:35:47,374+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler7) [7f18269b] FINISH, GetHostJobsVDSCommand, log id: 7941ea3c
2017-04-19 19:35:47,374+03 WARN  [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler7) [7f18269b] Command AmendVolume id: '33d9d5ec-bf29-450b-8026-e102bd9fa859': Failed to poll the job 'a9c7a685-a4c7-4943-a66c-d52f71646020' on host 'host_mixed_1' (id: '0ba63f2d-06b2-4392-82ce-7abd60614b2a'), will retry soon
2017-04-19 19:35:47,377+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-22) [5c58f757] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_1 command SpmStopVDS failed: Broken pipe
2017-04-19 19:35:47,377+03 WARN  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (org.ovirt.thread.pool-6-thread-22) [5c58f757] Could not stop spm of pool 'e23a7d84-49b5-47c0-b5ea-0204648d1dd4' on vds '0ba63f2d-06b2-4392-82ce-7abd60614b2a': VDSGenericException: VDSNetworkException: Broken pipe


VDSM:
2017-04-19 19:35:40,840+0300 ERROR (jsonrpc/3) [storage.Dispatcher] {'status': {'message': "Cannot find master domain: 'spUUID=e23a7d84-49b5-47c0-b5ea-0204648d1dd4, msdUUID=a7ba26e7-1ff4-437f-bd5e-690ef13a107d'", 'code': 304}} (dispatcher:77)
2017-04-19 19:35:40,841+0300 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call StoragePool.connect failed (error 304) in 0.36 seconds (__init__:515)
2017-04-19 19:35:41,167+0300 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 dd if=/rhev/data-center/e23a7d84-49b5-47c0-b5ea-0204648d1dd4/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None) (commands:69)
2017-04-19 19:35:41,182+0300 DEBUG (mailbox-spm) [storage.Misc.excCmd] FAILED: <err> = "dd: failed to open '/rhev/data-center/e23a7d84-49b5-47c0-b5ea-0204648d1dd4/mastersd/dom_md/inbox': No such file or directory\n"; <rc> = 1 (commands:93)
2017-04-19 19:35:41,183+0300 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] Error checking for mail (storage_mailbox:788)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 786, in run
    self._checkForMail()
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 735, in _checkForMail
    "Could not read mailbox: %s" % self._inbox)

2017-04-19 19:35:41,369+0300 ERROR (jsonrpc/5) [root] Panic: Unrecoverable errors during SPM stop process. (panic:28)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 436, in stopSpm
    self.masterDomain.releaseClusterLock()
  File "/usr/share/vdsm/storage/sdc.py", line 50, in __getattr__
    return getattr(self.getRealDomain(), attrName)
  File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 136, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 153, in _findDomain
    return findMethod(sdUUID)
  File "/usr/share/vdsm/storage/glusterSD.py", line 55, in findDomain
    return GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/glusterSD.py", line 51, in findDomainPath
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('a7ba26e7-1ff4-437f-bd5e-690ef13a107d',)
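
The traceback above shows VDSM's lazy domain proxy at work: attribute access on the cached domain object (`sdc.py`'s `__getattr__`) resolves the real domain on demand, so once the master domain's path is gone, even `releaseClusterLock()` during SPM stop raises `StorageDomainDoesNotExist`. A rough sketch of that pattern, with simplified hypothetical names (not the actual VDSM code):

```python
class StorageDomainDoesNotExist(Exception):
    pass

class DomainCache:
    """Maps a storage-domain UUID to a domain object, failing if it vanished."""
    def __init__(self, domains):
        self._domains = domains  # uuid -> real domain object

    def produce(self, sd_uuid):
        try:
            return self._domains[sd_uuid]
        except KeyError:
            raise StorageDomainDoesNotExist(sd_uuid)

class DomainProxy:
    """Lazy stand-in for a storage domain: resolve on first real use."""
    def __init__(self, cache, sd_uuid):
        self._cache = cache
        self._sd_uuid = sd_uuid

    def __getattr__(self, name):
        # Every attribute access resolves the real domain first, so a
        # vanished domain turns any method call into the exception above.
        return getattr(self._cache.produce(self._sd_uuid), name)
```

This is why the panic happens inside `stopSpm`: the proxy defers the existence check until the moment a method is actually invoked.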

Comment 1 Avihai 2017-04-20 15:35:42 UTC
As discussed with Liron, this bug was opened for deeper investigation.

The issue occurred multiple times in different automation tests with the same scenario (upgrade DC+cluster + engine restart) and on different storage domain types (iSCSI, NFS, Gluster).

Comment 3 Benny Zlotnik 2017-06-06 11:07:09 UTC
Note: this is unrelated to the cluster upgrade; the reconstruct happens because of the validation process of the storage domain removal. The actual issue is that reconstruction is triggered during validation when it should not be.
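
The linked gerrit patches change the order of validations in RemoveStorageDomainCommand so that cheap, side-effect-free checks run before any check that contacts storage and can trigger a master-domain reconstruct. A hedged illustration of the idea, with hypothetical names (the real fix is in the engine's Java code):

```python
class ValidationError(Exception):
    pass

def validate_remove_domain(domain, pool_is_up):
    """Sketch of reordered validation for removing a storage domain.

    Local checks come first and can never touch storage; only if they all
    pass do we run the check that talks to the pool (and could, on failure,
    kick off a reconstruct). `domain` is a dict and `pool_is_up` a callable
    here purely for illustration.
    """
    # 1. Local checks: no storage access, no reconstruct possible.
    if domain.get("is_master"):
        raise ValidationError("cannot remove the master domain")
    if domain.get("status") == "active":
        raise ValidationError("domain must be in maintenance")
    # 2. Only now the check that contacts storage.
    if not pool_is_up():
        raise ValidationError("storage pool is not reachable")
    return True
```

With this ordering, an invalid removal request (such as the teardown's attempt here) fails fast on a local check instead of reaching the storage-touching path.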

Comment 4 Allon Mureinik 2017-06-20 09:58:59 UTC
This patch is contained in the ovirt-engine-4.1.3.3 tag.

Comment 5 Avihai 2017-06-21 08:01:15 UTC
Verified in 4.1.3.4.

