Bug 979753
Summary: [vdsm] LV metadata corruption after failed LSM and snapshots merge

Product: Red Hat Enterprise Virtualization Manager
Component: vdsm
Version: 3.2.0
Target Release: 3.3.0
Hardware: x86_64
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Elad <ebenahar>
Assignee: Federico Simoncelli <fsimonce>
QA Contact: Leonid Natapov <lnatapov>
CC: abaron, acanan, amureini, bazulay, ebenahar, fsimonce, iheim, jkt, lpeer, scohen, yeylon
Flags: abaron: Triaged+
Whiteboard: storage
oVirt Team: Storage
Doc Type: Bug Fix
Doc Text: After a failed live storage migration, attempting to merge the snapshots caused logical volume metadata corruption. This problem has been fixed, so the corruption no longer occurs, and the disks can be moved after an unsuccessful live migration.
Type: Bug
Last Closed: 2014-01-21 16:26:38 UTC
Bug Depends On: 957703
Bug Blocks: 1019461
Description (Elad, 2013-06-30 08:53:52 UTC)

Created attachment 767025 [details]
logs and lv metadata
Does vdsm restart solve the problem?

(In reply to Ayal Baron from comment #2)
> Does vdsm restart solve the problem?

I didn't try to do so. Those disks are no longer on the system.
I'll try to reproduce it on another 3.2 env.

(In reply to Elad from comment #3)
> (In reply to Ayal Baron from comment #2)
> > Does vdsm restart solve the problem?
>
> I didn't try to do so. Those disks are no longer on the system.
> I'll try to reproduce it on another 3.2 env.

I'm having trouble reproducing it, so it's unclear whether a vdsm restart solves this problem or not.

These logs have so many errors, it's hard to know where to begin. The problem happened here on multiple images, but the first one is missing info, so tracking the flow of live storage migration for image a79175bf-08e8-4854-8acb-74d21a65366a:

Thread-278447::INFO::2013-06-19 16:04:53,200::logUtils::40::dispatcher::(wrapper) Run and protect: createVolume(sdUUID='ebe26784-f182-430f-9422-ddc162ca064c', spUUID='ba63f0c8-37ca-492b-b28c-55ae09c92a6a', imgUUID='a79175bf-08e8-4854-8acb-74d21a65366a', size='12884901888', volFormat=4, preallocate=2, diskType=2, volUUID='6df1ac7e-1992-4789-b3c3-de7a7f1d984f', desc='', srcImgUUID='a79175bf-08e8-4854-8acb-74d21a65366a', srcVolUUID='51a6795a-e26f-4a5c-a0db-635180539bbf')
7667b392-6998-4cdd-b4ab-c41006d72347::DEBUG::2013-06-19 16:05:34,934::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /sbin/lvm lvchange --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \'a%3514f0c542a000184|36006016012823000787411ed01d9e211|36006016012823000daf89aa8c7d8e211%\', \'r%.*%\' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --autobackup n --available n ebe26784-f182-430f-9422-ddc162ca064c/51a6795a-e26f-4a5c-a0db-635180539bbf' (cwd None)
7667b392-6998-4cdd-b4ab-c41006d72347::DEBUG::2013-06-19 16:05:40,340::misc::83::Storage.Misc.excCmd::(<lambda>) FAILED: <err> = ' /dev/mapper/36006016012823000daf89aa8c7d8e211: read failed after 0 of 4096 at 2199023190016: Input/output error\n /dev/mapper/36006016012823000daf89aa8c7d8e211: read failed after 0 of 4096 at 2199023247360: Input/output error\n /dev/mapper/36006016012823000daf89aa8c7d8e211: read failed after 0 of 4096 at 0: Input/output error\n WARNING: Error counts reached a limit of 3. Device /dev/mapper/36006016012823000daf89aa8c7d8e211 was disabled\n device-mapper: remove ioctl on  failed: Device or resource busy\n [this message repeats ~25 times in the original log] Unable to deactivate ebe26784--f182--430f--9422--ddc162ca064c-51a6795a--e26f--4a5c--a0db--635180539bbf (253:59)\n'; <rc> = 5

(lvchange failed for several volumes...)

Thread-278504::DEBUG::2013-06-19 16:06:13,017::BindingXMLRPC::913::vds::(wrapper) client [10.35.64.13]::call vmSnapshot with ('b18d74ad-b087-441c-9e9d-2a1112927a7e', [{'baseVolumeID': '51a6795a-e26f-4a5c-a0db-635180539bbf', 'domainID': 'ebe26784-f182-430f-9422-ddc162ca064c', 'volumeID': '6df1ac7e-1992-4789-b3c3-de7a7f1d984f', 'imageID': 'a79175bf-08e8-4854-8acb-74d21a65366a'}]) {}
Thread-278504::DEBUG::2013-06-19 16:06:24,488::BindingXMLRPC::920::vds::(wrapper) return vmSnapshot with {'status': {'message': 'Done', 'code': 0}, 'quiesce': False}
Thread-278512::INFO::2013-06-19 16:06:24,749::logUtils::40::dispatcher::(wrapper) Run and protect: getVolumeInfo(sdUUID='ebe26784-f182-430f-9422-ddc162ca064c', spUUID='ba63f0c8-37ca-492b-b28c-55ae09c92a6a', imgUUID='a79175bf-08e8-4854-8acb-74d21a65366a', volUUID='6df1ac7e-1992-4789-b3c3-de7a7f1d984f', options=None)
Thread-278512::INFO::2013-06-19 16:06:26,087::logUtils::42::dispatcher::(wrapper) Run and protect: getVolumeInfo, Return response: {'info': {'status': 'OK', 'domain': 'ebe26784-f182-430f-9422-ddc162ca064c', 'voltype': 'LEAF', 'description': '', 'parent': '51a6795a-e26f-4a5c-a0db-635180539bbf', 'format': 'COW', 'image': 'a79175bf-08e8-4854-8acb-74d21a65366a', 'ctime': '1371654371', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1371654371', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '12884901888', 'uuid': '6df1ac7e-1992-4789-b3c3-de7a7f1d984f', 'truesize': '1073741824', 'type': 'SPARSE'}}
Thread-278513::INFO::2013-06-19 16:06:26,392::logUtils::40::dispatcher::(wrapper) Run and protect: cloneImageStructure(spUUID='ba63f0c8-37ca-492b-b28c-55ae09c92a6a', sdUUID='ebe26784-f182-430f-9422-ddc162ca064c', imgUUID='a79175bf-08e8-4854-8acb-74d21a65366a', dstSdUUID='ae744a61-0e73-480e-a113-14649d0e8e73')
Thread-278513::INFO::2013-06-19 16:06:28,560::logUtils::42::dispatcher::(wrapper) Run and protect: cloneImageStructure, Return response: None

(clone task finished successfully)

Thread-278564::DEBUG::2013-06-19 16:07:36,412::BindingXMLRPC::913::vds::(wrapper) client [10.35.64.13]::call vmDiskReplicateStart with ('b18d74ad-b087-441c-9e9d-2a1112927a7e', {'device': 'disk', 'domainID': 'ebe26784-f182-430f-9422-ddc162ca064c', 'volumeID': '6df1ac7e-1992-4789-b3c3-de7a7f1d984f', 'poolID': 'ba63f0c8-37ca-492b-b28c-55ae09c92a6a', 'imageID': 'a79175bf-08e8-4854-8acb-74d21a65366a'}, {'device': 'disk', 'domainID': 'ae744a61-0e73-480e-a113-14649d0e8e73', 'volumeID': '6df1ac7e-1992-4789-b3c3-de7a7f1d984f', 'poolID': 'ba63f0c8-37ca-492b-b28c-55ae09c92a6a', 'imageID': 'a79175bf-08e8-4854-8acb-74d21a65366a'}) {} flowID [db2f067]
Thread-278566::INFO::2013-06-19 16:07:38,258::logUtils::40::dispatcher::(wrapper) Run and protect: syncImageData(spUUID='ba63f0c8-37ca-492b-b28c-55ae09c92a6a', sdUUID='ebe26784-f182-430f-9422-ddc162ca064c', imgUUID='a79175bf-08e8-4854-8acb-74d21a65366a', dstSdUUID='ae744a61-0e73-480e-a113-14649d0e8e73', syncType='INTERNAL')
Thread-278570::DEBUG::2013-06-19 16:07:44,062::BindingXMLRPC::913::vds::(wrapper) client [10.35.64.13]::call vmGetStats with ('b18d74ad-b087-441c-9e9d-2a1112927a7e',) {} flowID [79550d4e]
Thread-278570::DEBUG::2013-06-19 16:07:44,088::BindingXMLRPC::920::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'username': u'None', 'memUsage': '55', 'acpiEnable': 'true', 'pid': '6879', 'displayPort': u'5902', 'session': 'Unknown', 'displaySecurePort': u'5903', 'timeOffset': 3601L, 'memoryStats': {u'swap_out': '0', u'majflt': '0', u'mem_free': '1399628', u'swap_in': '0', u'pageflt': '9', u'mem_total': '2956800', u'mem_unused': '377788'}, 'balloonInfo': {'balloon_max': 3145728, 'balloon_cur': 3145728}, 'pauseCode': 'NOERR', 'disksUsage': [{u'path': u'/', u'total': '11418218496', u'fs': u'ext3', u'used': '3419381760'}, {u'path': u'/proc/bus/usb', u'total': '0', u'fs': u'usbfs', u'used': '0'}, {u'path': u'/boot', u'total': '203097088', u'fs': u'ext3', u'used': '32711680'}, {u'path': u'/var/lib/jenkins/workspace', u'total': '1057951318016', u'fs': u'nfs', u'used': '1000840626176'}], 'network': {u'vnet1': {'macAddr': '00:1a:4a:16:97:11', 'rxDropped': '0', 'rxErrors': '0', 'txDropped': '0', 'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0', 'state': 'unknown', 'speed': '1000', 'name': u'vnet1'}}, 'displayType': 'qxl', 'guestName': u'jenkins-vm-06.scl.lab.tlv.redhat.com', 'elapsedTime': '-153281', 'vmType': 'kvm', 'cpuSys': '4.78', 'appsList': [u'rhevm-guest-agent-common-1.0.7-11.el6ev', u'kernel-2.6.32-358.el6'], 'guestOs': u'2.6.32-358.el6.x86_64', 'status': 'Paused', 'hash': '188640836589636024', 'vmId': 'b18d74ad-b087-441c-9e9d-2a1112927a7e', 'displayIp': '0', 'lastLogout': 1371807745.481838, 'lastUser': u'None', 'clientIp': '', 'kvmEnable': 'true', 'disks': {u'vda': {'readLatency': '0', 'apparentsize': '1073741824', 'writeLatency': '1446138', 'imageID': 'a79175bf-08e8-4854-8acb-74d21a65366a', 'flushLatency': '253698', 'readRate': '0.00', 'truesize': '1073741824', 'writeRate': '6946.87'}, u'hdc': {'readLatency': '0', 'apparentsize': '0', 'writeLatency': '0', 'flushLatency': '0', 'readRate': '0.00', 'truesize': '0', 'writeRate': '0.00'}}, 'monitorResponse': '0', 'statsAge': '-321435.10', 'cpuUser': '10.40', 'lastLogin': 1371639608.1277881, 'guestIPs': u'10.35.161.17', 'netIfaces': [{u'name': u'eth0', u'inet6': [u'fe80::21a:4aff:fe16:9711'], u'inet': [u'10.35.161.17'], u'hw': u'00:1a:4a:16:97:11'}]}]}
Thread-278748::DEBUG::2013-06-19 16:11:39,438::BindingXMLRPC::913::vds::(wrapper) client [10.35.64.13]::call vmShutdown with ('b18d74ad-b087-441c-9e9d-2a1112927a7e', '30', 'System Administrator has initiated shutdown of this Virtual Machine. Virtual Machine is shutting down.') {} flowID [36fa4a1e]
libvirtEventLoop::ERROR::2013-06-19 16:12:51,320::libvirtvm::2255::vm.Vm::(_onBlockJobEvent) vmId=`b18d74ad-b087-441c-9e9d-2a1112927a7e`::Live merge completed for an unexpected path: /rhev/data-center/ba63f0c8-37ca-492b-b28c-55ae09c92a6a/ebe26784-f182-430f-9422-ddc162ca064c/images/a79175bf-08e8-4854-8acb-74d21a65366a/6df1ac7e-1992-4789-b3c3-de7a7f1d984f
libvirtEventLoop::DEBUG::2013-06-19 16:12:51,321::libvirtvm::3060::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`b18d74ad-b087-441c-9e9d-2a1112927a7e`::event Stopped detail 0 opaque None
Thread-278838::DEBUG::2013-06-19 16:12:52,581::BindingXMLRPC::913::vds::(wrapper) client [10.35.64.13]::call vmGetStats with ('b18d74ad-b087-441c-9e9d-2a1112927a7e',) {} flowID [2eef795c]
Thread-278838::DEBUG::2013-06-19 16:12:52,581::BindingXMLRPC::920::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'hash': '188640836589636024', 'exitMessage': 'User shut down', 'vmId': 'b18d74ad-b087-441c-9e9d-2a1112927a7e', 'timeOffset': 428402L, 'exitCode': 0}]}
Thread-278840::DEBUG::2013-06-19 16:12:52,676::BindingXMLRPC::913::vds::(wrapper) client [10.35.64.13]::call vmDestroy with ('b18d74ad-b087-441c-9e9d-2a1112927a7e',) {} flowID [2eef795c]
Thread-278840::DEBUG::2013-06-19 16:12:52,679::BindingXMLRPC::920::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}
2013-06-24 16:13:06,717 WARN [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (http-/0.0.0.0:8443-5) [5dfb4d76] CanDoAction of action RemoveSnapshot failed. Reasons: ACTION_TYPE_FAILED_VM_IS_DURING_SNAPSHOT,VAR__TYPE__SNAPSHOT,VAR__ACTION__REMOVE
44e0467c-9e18-4262-9f9a-ba232aee7523::DEBUG::2013-06-19 17:40:31,263::threadPool::55::Misc.ThreadPool::(setRunningTask) Number of running tasks: 10
2013-06-24 17:40:23,628 INFO [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-55) [2439cb3d] BaseAsyncTask::OnTaskEndSuccess: Task 44e0467c-9e18-4262-9f9a-ba232aee7523 (Parent Command LiveMigrateDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended successfully.
2013-06-24 17:40:23,629 INFO [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-3-thread-27) [2439cb3d] EntityAsyncTask::EndCommandAction [within thread] context: Attempting to EndAction LiveMigrateDisk, executionIndex: 1
2013-06-24 17:40:23,720 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-3-thread-27) [db2f067] Ending command with failure: org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand
2013-06-24 17:40:23,720 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (pool-3-thread-27) [db2f067] Cannot get vdsManager for vdsid=00000000-0000-0000-0000-000000000000
2013-06-24 17:40:23,721 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (pool-3-thread-27) [db2f067] CreateCommand failed: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: Vds with id: 00000000-0000-0000-0000-000000000000 was not found
Thread-282346::DEBUG::2013-06-19 17:40:33,141::taskManager::93::TaskManager::(getTaskStatus) Entry. taskID: 44e0467c-9e18-4262-9f9a-ba232aee7523
Thread-282346::DEBUG::2013-06-19 17:40:33,142::taskManager::96::TaskManager::(getTaskStatus) Return. Response: {'code': 0, 'message': '1 jobs completed successfully', 'taskState': 'finished', 'taskResult': 'success', 'taskID': '44e0467c-9e18-4262-9f9a-ba232aee7523'}
Thread-316530::DEBUG::2013-06-20 09:20:24,138::task::579::TaskManager.Task::(_updateState) Task=`8e7c5fa7-a848-47d6-9a0e-85ef8399f763`::moving from state init -> state preparing
Thread-316530::INFO::2013-06-20 09:20:24,138::logUtils::40::dispatcher::(wrapper) Run and protect: moveImage(spUUID='ba63f0c8-37ca-492b-b28c-55ae09c92a6a', srcDomUUID='ebe26784-f182-430f-9422-ddc162ca064c', dstDomUUID='ae744a61-0e73-480e-a113-14649d0e8e73', imgUUID='a79175bf-08e8-4854-8acb-74d21a65366a', vmUUID='', op=2, postZero='false', force='false')
8e7c5fa7-a848-47d6-9a0e-85ef8399f763::ERROR::2013-06-20 09:22:33,559::image::532::Storage.Image::(_createTargetImage) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 517, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/blockSD.py", line 610, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 418, in create
    raise se.VolumeAlreadyExists(volUUID)
VolumeAlreadyExists: Volume already exists: ('c0b26357-538a-46fa-b5ba-e491c8ee7c7e',)

Lots of storage errors in the middle, including a previous LSM which failed and caused the same error, and:

ad1f3ea8-a01b-4bbc-8912-7d115d5008df::DEBUG::2013-06-19 16:09:15,908::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /sbin/lvm lvcreate --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \'a%3514f0c542a000184|36006016012823000787411ed01d9e211|36006016012823000daf89aa8c7d8e211%\', \'r%.*%\' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --autobackup n --contiguous n --size 12288m --addtag OVIRT_VOL_INITIALIZING --name 4a1d6bd4-b874-422d-aa64-acf7c28c10b5 ae744a61-0e73-480e-a113-14649d0e8e73' (cwd None)
ad1f3ea8-a01b-4bbc-8912-7d115d5008df::DEBUG::2013-06-19 16:09:17,252::misc::83::Storage.Misc.excCmd::(<lambda>) FAILED: <err> = ' /dev/mapper/36006016012823000daf89aa8c7d8e211: read failed after 0 of 4096 at 2199023190016: Input/output error\n /dev/mapper/36006016012823000daf89aa8c7d8e211: read failed after 0 of 4096 at 2199023247360: Input/output error\n /dev/mapper/36006016012823000daf89aa8c7d8e211: read failed after 0 of 4096 at 0: Input/output error\n WARNING: Error counts reached a limit of 3. Device /dev/mapper/36006016012823000daf89aa8c7d8e211 was disabled\n WARNING: This metadata update is NOT backed up\n device-mapper: create ioctl on ae744a61--0e73--480e--a113--14649d0e8e73-4a1d6bd4--b874--422d--aa64--acf7c28c10b5 failed: Device or resource busy\n Failed to activate new LV.\n WARNING: This metadata update is NOT backed up\n'; <rc> = 5
Thread-281333::WARNING::2013-06-19 17:13:44,418::blockSD::172::Storage.BlockSD::(getAllVolumes) Found broken image 5ff884cf-dddc-4cb1-9f9b-1803d606f9d6, orphan volume ae744a61-0e73-480e-a113-14649d0e8e73/bcac1899-987d-4b79-9461-c526b01dec37, parent 84de075b-9b0c-4af3-aca5-7bc1b3a82ddd

Searched for: b18d74ad-b087-441c-9e9d-2a1112927a7e|syncImage|Thread-278504:|6df1ac7e-1992-4789-b3c3-de7a7f1d984f|Thread-278447:|7667b392-6998-4cdd-b4ab-c41006d72347|Thread-278512:|Thread-278513:|ba6e3d5a-8a12-45da-aa8f-5742adc8246d:|FAILED|^44e0467c-9e18-4262-9f9a-ba232aee7523|Thread-278570:|I am

So from the logs above:
1. user called LSM
2. during syncImageData, user shut down the VM
3. when syncImageData finished, engine threw an exception
4. moveImage fails on VolumeAlreadyExists

Still need to track further to see how we reach the 'There is no leaf in the image' error.
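Step 4 of the flow above fails because vdsm's createVolume refuses to reuse a volume UUID that is already present on the destination. A minimal sketch of that guard (hypothetical names, not the actual vdsm code; the real check lives in volume.py's create):

```python
class VolumeAlreadyExists(Exception):
    """Raised when the requested volume UUID already exists in the image."""

def create_volume(volumes, vol_uuid, parent=None):
    """Create a volume entry, refusing to overwrite an existing one.

    `volumes` maps volUUID -> parent volUUID. A leftover volume from an
    aborted move makes the retry fail loudly instead of silently reusing
    (and potentially corrupting) the existing LV.
    """
    if vol_uuid in volumes:
        raise VolumeAlreadyExists(vol_uuid)
    volumes[vol_uuid] = parent
    return vol_uuid
```

This is why a failed LSM that leaves volumes behind on the destination blocks every later moveImage of the same image until the leftovers are removed.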
2013-06-24 17:40:23,721 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (pool-3-thread-27) [db2f067] CreateCommand failed: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: Vds with id: 00000000-0000-0000-0000-000000000000 was not found

That is fixed by: http://gerrit.ovirt.org/#/c/19876/

(In reply to Ayal Baron from comment #6)
> So from the logs above:
> 1. user called LSM
> 2. during syncImageData, user shut down the VM
> 3. when syncImageData finished, engine threw an exception
> 4. moveImage fails on VolumeAlreadyExists

The exception at point 3 should be covered (as per comment 7) by: http://gerrit.ovirt.org/#/c/19876/

> Still need to track further to see how we reach the 'There is no leaf in the
> image' error.

The 'no leaf in image' error happens before the flow analyzed in comment 6 and it's generated by a different run. As per cloneImageStructure in sp.py and __getResourceCandidatesList in resourceFactories.py:

    def cloneImageStructure(self, sdUUID, imgUUID, dstSdUUID):
        ...
        # Preparing the ordered resource list to be acquired
        resList = (rmanager.acquireResource(*x) for x in sorted((
            (srcImgResNs, imgUUID, rm.LockType.shared),
            (dstImgResNs, imgUUID, rm.LockType.exclusive),
        )))
        ...

    def __getResourceCandidatesList(self, resourceName, lockType):
        ...
        try:
            chain = image.Image(repoPath).getChain(sdUUID=self.sdUUID,
                                                   imgUUID=resourceName)
        except se.ImageDoesNotExistInSD:
            log.debug("Image %s does not exist in domain %s",
                      resourceName, self.sdUUID)
            return []
        ...

The resource manager always tries to find the image using getChain, and in this case on the destination there was an incomplete leftover (maybe from a previous run?).
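The sorted() in the cloneImageStructure snippet above acquires the source and destination image resources in a deterministic order, the standard way to rule out deadlock when two flows need the same pair of locks. A self-contained sketch of that pattern (the resource names here are illustrative, not vdsm's actual namespace strings):

```python
import threading

# Hypothetical resource names standing in for the vdsm image namespaces.
LOCKS = {
    "dst_imageNS.a79175bf": threading.Lock(),
    "src_imageNS.a79175bf": threading.Lock(),
}

def acquire_ordered(*names):
    """Acquire the named locks in sorted order.

    Two flows that both need this pair always lock them in the same
    sequence, so neither can hold one lock while waiting on the other.
    """
    taken = []
    for name in sorted(names):
        LOCKS[name].acquire()
        taken.append(name)
    return taken

def release_all(taken):
    # Release in reverse acquisition order.
    for name in reversed(taken):
        LOCKS[name].release()
```

Usage: held = acquire_ordered("src_imageNS.a79175bf", "dst_imageNS.a79175bf") always locks the dst entry first, regardless of argument order.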
Logs:

Thread-274244::DEBUG::2013-06-19 14:06:55,136::task::579::TaskManager.Task::(_updateState) Task=`9e015034-826d-4d51-93a3-c8caea3c179c`::moving from state init -> state preparing
Thread-274244::INFO::2013-06-19 14:06:55,137::logUtils::40::dispatcher::(wrapper) Run and protect: cloneImageStructure(spUUID='ba63f0c8-37ca-492b-b28c-55ae09c92a6a', sdUUID='ebe26784-f182-430f-9422-ddc162ca064c', imgUUID='c95c5fec-9e3d-4f30-bb75-728a8e772ccf', dstSdUUID='ae744a61-0e73-480e-a113-14649d0e8e73')
...
9e015034-826d-4d51-93a3-c8caea3c179c::DEBUG::2013-06-19 14:06:57,710::resourceManager::486::ResourceManager::(registerResource) Trying to register resource 'ae744a61-0e73-480e-a113-14649d0e8e73_imageNS.c95c5fec-9e3d-4f30-bb75-728a8e772ccf' for lock type 'exclusive'
...
9e015034-826d-4d51-93a3-c8caea3c179c::ERROR::2013-06-19 14:06:58,529::image::332::Storage.Image::(getChain) There is no leaf in the image c95c5fec-9e3d-4f30-bb75-728a8e772ccf
9e015034-826d-4d51-93a3-c8caea3c179c::WARNING::2013-06-19 14:06:58,530::resourceManager::520::ResourceManager::(registerResource) Resource factory failed to create resource 'ae744a61-0e73-480e-a113-14649d0e8e73_imageNS.c95c5fec-9e3d-4f30-bb75-728a8e772ccf'. Canceling request.
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/resourceManager.py", line 518, in registerResource
    obj = namespaceObj.factory.createResource(name, lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 193, in createResource
    lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 122, in __getResourceCandidatesList
    imgUUID=resourceName)
  File "/usr/share/vdsm/storage/image.py", line 333, in getChain
    raise se.ImageIsNotLegalChain(imgUUID)
ImageIsNotLegalChain: Image is not a legal chain: ('c95c5fec-9e3d-4f30-bb75-728a8e772ccf',)

As we can see, the resource we're trying to acquire here is the destination one (exclusive) on the target storage domain ae744a61-0e73-480e-a113-14649d0e8e73.
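getChain fails here because a legal chain must contain exactly one leaf volume, and an image that holds only a partial leftover (an INTERNAL volume with no LEAF on top) has none. A rough sketch of that invariant (illustrative only, not the actual vdsm implementation):

```python
class ImageIsNotLegalChain(Exception):
    """Raised when an image's volumes do not form a single-leaf chain."""

def find_leaf(volumes):
    """Return the single LEAF volume of an image chain.

    `volumes` maps volUUID -> VOLTYPE ('LEAF' or 'INTERNAL'). A legal
    chain has exactly one LEAF; a partial leftover that contains only
    INTERNAL volumes therefore fails this check.
    """
    leaves = [uuid for uuid, voltype in volumes.items() if voltype == "LEAF"]
    if len(leaves) != 1:
        raise ImageIsNotLegalChain(sorted(volumes))
    return leaves[0]
```

Under this model the leftover described below (a lone VOLTYPE=INTERNAL volume) yields zero leaves, which is exactly the 'There is no leaf in the image' condition in the log.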
In that domain there was still probably a partial leftover of image c95c5fec-9e3d-4f30-bb75-728a8e772ccf from a previous run, and I've been able to find some trace of it in the metadata:

DOMAIN=ae744a61-0e73-480e-a113-14649d0e8e73
VOLTYPE=INTERNAL
CTIME=1371720608
FORMAT=RAW
IMAGE=c95c5fec-9e3d-4f30-bb75-728a8e772ccf
DISKTYPE=2
PUUID=00000000-0000-0000-0000-000000000000
LEGALITY=LEGAL
MTIME=1371722493
POOL_UUID=
SIZE=25165824
TYPE=PREALLOCATED
DESCRIPTION=
EOF

It is possible to reproduce this manually:

# lvs @IU_5cc65754-ab9c-44f3-b088-4e5faa025bc3
  LV                                   VG                                   Attr      LSize
  048e78d7-f15a-44f2-8596-865070bde72b 864df2d0-b022-435f-a9d1-b7ac0bd766bb -wi------ 128.00m
  0625195c-c1fb-4a20-be66-789921bcfe68 864df2d0-b022-435f-a9d1-b7ac0bd766bb -wi------ 128.00m

# vdsClient 0 cloneImageStructure 98da1408-948d-4cab-9a8b-418914be9f07 864df2d0-b022-435f-a9d1-b7ac0bd766bb 5cc65754-ab9c-44f3-b088-4e5faa025bc3 c8c60dca-3ec8-4ea0-8135-d929070055cb

# lvs @IU_5cc65754-ab9c-44f3-b088-4e5faa025bc3
  LV                                   VG                                   Attr      LSize
  048e78d7-f15a-44f2-8596-865070bde72b 864df2d0-b022-435f-a9d1-b7ac0bd766bb -wi------ 128.00m
  0625195c-c1fb-4a20-be66-789921bcfe68 864df2d0-b022-435f-a9d1-b7ac0bd766bb -wi------ 128.00m
  048e78d7-f15a-44f2-8596-865070bde72b c8c60dca-3ec8-4ea0-8135-d929070055cb -wi------ 128.00m
  0625195c-c1fb-4a20-be66-789921bcfe68 c8c60dca-3ec8-4ea0-8135-d929070055cb -wi------ 128.00m

# lvremove c8c60dca-3ec8-4ea0-8135-d929070055cb/0625195c-c1fb-4a20-be66-789921bcfe68
  Logical volume "0625195c-c1fb-4a20-be66-789921bcfe68" successfully removed

# lvs @IU_5cc65754-ab9c-44f3-b088-4e5faa025bc3
  LV                                   VG                                   Attr      LSize
  048e78d7-f15a-44f2-8596-865070bde72b 864df2d0-b022-435f-a9d1-b7ac0bd766bb -wi------ 128.00m
  0625195c-c1fb-4a20-be66-789921bcfe68 864df2d0-b022-435f-a9d1-b7ac0bd766bb -wi------ 128.00m
  048e78d7-f15a-44f2-8596-865070bde72b c8c60dca-3ec8-4ea0-8135-d929070055cb -wi------ 128.00m

# vdsClient 0 cloneImageStructure 98da1408-948d-4cab-9a8b-418914be9f07 864df2d0-b022-435f-a9d1-b7ac0bd766bb 5cc65754-ab9c-44f3-b088-4e5faa025bc3 c8c60dca-3ec8-4ea0-8135-d929070055cb

6f189c2d-6d35-4757-bf9e-71a538f7288c::ERROR::2013-10-10 10:02:37,829::image::203::Storage.Image::(getChain) There is no leaf in the image 5cc65754-ab9c-44f3-b088-4e5faa025bc3
6f189c2d-6d35-4757-bf9e-71a538f7288c::WARNING::2013-10-10 10:02:37,829::resourceManager::590::ResourceManager::(registerResource) Resource factory failed to create resource 'c8c60dca-3ec8-4ea0-8135-d929070055cb_imageNS.5cc65754-ab9c-44f3-b088-4e5faa025bc3'. Canceling request.
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/resourceManager.py", line 586, in registerResource
    obj = namespaceObj.factory.createResource(name, lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 193, in createResource
    lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 122, in __getResourceCandidatesList
    imgUUID=resourceName)
  File "/usr/share/vdsm/storage/image.py", line 204, in getChain
    raise se.ImageIsNotLegalChain(imgUUID)
ImageIsNotLegalChain: Image is not a legal chain: ('5cc65754-ab9c-44f3-b088-4e5faa025bc3',)

Anyway, we don't have any indication of how that leftover was still/already present on the destination (no logs), but I think it might have been caused by a previous syncImageData that was keeping the LV open during the deletion. In my opinion there's nothing new to take care of here.

This is probably a duplicate of both bug 957703 and bug 1000816.

This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.
Otherwise, to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug. For further details on the Cause, Consequence, Fix, Result format please refer to: https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.

Verified on is29. The image can be moved after an unsuccessful LSM and snapshot merge.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html