Bug 1417456
Summary: | Disk move stuck after vdsmd restart on SPM during disk move between storage domains | | |
---|---|---|---|
Product: | [oVirt] ovirt-engine | Reporter: | Avihai <aefrat> |
Component: | BLL.Storage | Assignee: | Liron Aravot <laravot> |
Status: | CLOSED DUPLICATE | QA Contact: | Raz Tamir <ratamir> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.1.0.2 | CC: | bazulay, bugs, gklein, laravot, lsurette, srevivo, tnisan, ycui, ykaul |
Target Milestone: | ovirt-4.1.0-rc | Flags: | rule-engine: ovirt-4.1? |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2017-01-30 13:37:04 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |

Might be related to new HSM infrastructure on cold move disk

Liron, seems to me like a duplicate of 1415502, isn't it?

Tal,
Indeed.

*** This bug has been marked as a duplicate of bug 1415502 ***

Created attachment 1245582
engine & vdsm logs

Description of problem:
Disk move is stuck after a vdsmd restart on the SPM during a disk move between storage domains.

Version-Release number of selected component (if applicable):
Engine = ovirt-engine-4.1.0.2-0.2.el7.noarch
vdsm = 4.19.2-2

How reproducible:
Happened once with a 10G disk; did not happen with a smaller 5G disk.

Steps to Reproduce:
1. Create a 10G preallocated disk.
2. Move the disk to a different storage domain.
   I moved the disk between two iSCSI storage domains: source = iscsi_2, target = iscsi_1.
3. Restart vdsmd on the SPM host (host_mixed_2).
4. A new SPM is selected (host_mixed_3).
5. Check the disk status.

Actual results:
The disk is stuck in "LOCKED" status indefinitely (1H+) without any change and shows 19% progress.

Expected results:
The action should be rolled back: the disk should not be copied, and it should be available on the source storage domain after the new SPM host is up.

Additional info:
From the new SPM host vdsm.log:

2017-01-29 15:30:39,015 ERROR (upgrade/c9d819f) [storage.StoragePool] Unhandled exception (utils:371)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 368, in wrapper
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 180, in run
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 232, in _upgradePoolDomain
    self._finalizePoolUpgradeIfNeeded()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")

Timeline (last event first):
Jan 29, 2017 3:31:03 PM Status of host host_mixed_2 was set to Up.
Jan 29, 2017 3:30:57 PM VDSM host_mixed_2 command GetCapabilitiesVDS failed: Client close
Jan 29, 2017 3:30:39 PM Storage Pool Manager runs on Host host_mixed_3 (Address: storage-ge4-vdsm3.qa.lab.tlv.redhat.com).
Jan 29, 2017 3:30:38 PM VDSM host_mixed_2 command HSMGetAllTasksStatusesVDS failed: Not SPM: ()
Jan 29, 2017 3:30:38 PM Invalid status on Data Center golden_env_mixed. Setting status to Non Responsive.
Jan 29, 2017 3:30:34 PM Host host_mixed_2 is not responding. Host cannot be fenced automatically because power management for the host is disabled.
Jan 29, 2017 3:30:33 PM
Jan 29, 2017 3:30:18 PM User admin@internal-authz moving disk preallocated_disk to domain iscsi_1.
Jan 29, 2017 3:30:16 PM The disk 'preallocated_disk' was successfully added.
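
Note on the traceback under "Additional info": the last frame is a wrapper in securable.py that raises SecureError("Secured object is not in safe state"). The snippet below is only a rough sketch of what such a guard might look like, to make the failure mode easier to reason about. It is not vdsm's code: the `secured` decorator, the `Pool` class, and all of its methods are hypothetical names made up for illustration, and only the exception name and message come from the log.

```python
import functools


class SecureError(RuntimeError):
    """Raised when a guarded method runs on an object that is not safe."""


def secured(method):
    """Hypothetical guard: allow the call only while the object is safe."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self._safe:
            # Same message as in the vdsm.log traceback.
            raise SecureError("Secured object is not in safe state")
        return method(self, *args, **kwargs)
    return wrapper


class Pool(object):
    """Hypothetical stand-in for a storage pool; not vdsm's StoragePool."""

    def __init__(self):
        self._safe = False

    def set_safe(self):
        self._safe = True

    def set_unsafe(self):
        self._safe = False

    @secured
    def finalize_upgrade(self):
        return "finalized"

    def upgrade_domain(self):
        # Unguarded entry point that calls a guarded method, so the call
        # fails if the pool is not (or is no longer) in the safe state.
        return self.finalize_upgrade()
```

With a guard of this shape, a background task that starts while the object is safe can still fail later if the object is marked unsafe before the task reaches its next guarded call. Whether that is what happened here would need to be confirmed against the attached logs and bug 1415502.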