Bug 1011543 - When deleting snapshots created for Live Storage Migration, RHEV removes the source disk rather than the snapshot
Summary: When deleting snapshots created for Live Storage Migration, RHEV removes the ...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: All
OS: Linux
high
urgent
Target Milestone: ---
: 3.2.4
Assignee: Daniel Erez
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On: 1005256
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-09-24 13:57 UTC by rhev-integ
Modified: 2020-08-03 15:42 UTC (History)
15 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-10-27 18:51:15 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-


Attachments (Terms of Use)

Description rhev-integ 2013-09-24 13:57:08 UTC
+++ This bug is a RHEV-M zstream clone. The original bug is: +++
+++   https://bugzilla.redhat.com/show_bug.cgi?id=1005256. +++

======================================================================



----------------------------------------------------------------------
Following comment by avoss on September 06 at 14:12:57, 2013

Description of problem:
When deleting the snapshot created for Live Storage Migration, it seems to have swapped the snapshot and the original image. The resulting preallocated image is marked in the database as being the size of the snapshot, and the image isn't bootable

Version-Release number of selected component (if applicable):
rhevm-3.2.1-0.39.el6ev
RHEV-H 20130709.0.el6_4

How reproducible:
unknown

Steps to Reproduce:
1. Live migrate several VMs with preallocated disks to another storage domain
2. While VMs are down, attempt to remove the snapshot created by the migration
3. Attempt to start those VMs

Actual results:
- VM refuses to boot
- Disk image in database listed as the size of the snapshot, rather than the disk size
- Image on disk is full size of original preallocated image

Expected results:
- successful VM boot
- correct image size in database

Additional info:
In the hopes that the problem was simply incorrect size data in teh database, I modified the disk_image_dynamic table to match the size on disk. Image still isn't bootable.

----------------------------------------------------------------------
Following comment by avoss on September 06 at 14:19:10, 2013

Following l-dtarbox-vm01 in audit_log.txt:

$ grep l-dtarbox-vm01 audit_log.txt 
 2013-08-13 14:26:54.903-04 | 63edf907       | Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'l-dtarbox-vm01' was initiated by klussier.
 2013-08-13 14:27:39.321-04 |                | Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'l-dtarbox-vm01' has been completed.
 2013-08-13 14:27:40.813-04 | 18ddece7       | User klussier moving disk l-dtarbox-vm01_Disk1 to domain DWI_LUN12.
 2013-08-13 14:46:27.955-04 | 18ddece7       | User klussier finished moving disk l-dtarbox-vm01_Disk1 to domain DWI_LUN12.
 2013-08-14 10:38:02.617-04 | 118cfc27       | VM shutdown initiated by klussier on VM l-dtarbox-vm01 (Host: dwi1.dwi.litle.com).
 2013-08-14 10:39:27.341-04 |                | VM l-dtarbox-vm01 is down. Exit message: User shut down
 2013-08-14 10:40:41.695-04 | cef691         | Snapshot 'Auto-generated for Live Storage Migration' deletion for VM 'l-dtarbox-vm01' was initiated by klussier.
 2013-08-14 11:23:48.846-04 | cef691         | Snapshot 'Auto-generated for Live Storage Migration' deletion for VM 'l-dtarbox-vm01' has been completed.
 2013-08-15 07:17:22.462-04 | 2ac2516a       | VM l-dtarbox-vm01 was started by klussier (Host: dwi3.dwi.litle.com).
 2013-08-15 07:17:52.841-04 | 2ac2516a       | VM l-dtarbox-vm01 started on Host dwi3.dwi.litle.com
 2013-08-15 10:54:28.243-04 | 2964fa97       | Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'l-dtarbox-vm01' was initiated by klussier.
 2013-08-15 10:54:49.492-04 |                | Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'l-dtarbox-vm01' has been completed.
 2013-08-15 10:54:51.058-04 | 693a69c        | User klussier moving disk l-dtarbox-vm01_Disk1 to domain DWI_LUN10.
 2013-08-15 12:10:39.239-04 | 693a69c        | User klussier finished moving disk l-dtarbox-vm01_Disk1 to domain DWI_LUN10.
 2013-08-15 16:58:27.657-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 09:52:36.613-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 09:57:36.259-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 10:37:05.586-04 | 1d65e755       | Migration completed (VM: l-dtarbox-vm01, Source: dwi3.dwi.litle.com, Destination: dwi5.dwi.litle.com).
 2013-08-16 10:40:18.249-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 11:01:04.458-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 11:11:10.561-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 11:35:10.888-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 15:31:08.184-04 | 14392ba1       | Migration completed (VM: l-dtarbox-vm01, Source: dwi5.dwi.litle.com, Destination: dwi4.dwi.litle.com).
 2013-08-16 15:35:02.851-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 16:07:42.045-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 16:38:33.598-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 16:47:09.277-04 | 1aad1b76       | VM shutdown initiated by klussier on VM l-dtarbox-vm01 (Host: dwi4.dwi.litle.com).
 2013-08-16 16:51:54.208-04 |                | VM l-dtarbox-vm01 is not responding.
 2013-08-16 16:57:36.499-04 |                | VM l-dtarbox-vm01 is down. Exit message: User shut down
 2013-08-16 18:15:28.333-04 | 78c98ddf       | Snapshot 'Auto-generated for Live Storage Migration' deletion for VM 'l-dtarbox-vm01' was initiated by klussier.
 2013-08-16 18:32:44.257-04 | 78c98ddf       | Snapshot 'Auto-generated for Live Storage Migration' deletion for VM 'l-dtarbox-vm01' has been completed.
 2013-08-16 18:42:43.155-04 | 6c48e082       | VM l-dtarbox-vm01 was started by klussier (Host: dwi4.dwi.litle.com).
 2013-08-16 18:43:43.572-04 | 6c48e082       | VM l-dtarbox-vm01 started on Host dwi4.dwi.litle.com

I can't find the correlation id 78c98ddf in any vdsm.log file on any hypervisor... but xzgrep 0d30be2e-0882-4627-a4a1-f673fe784dca vdsm.log.39.xz on dwi1:

Thread-3657::INFO::2013-08-16 21:38:29,236::sp::1275::Storage.StoragePool::(updateVM) vmUUID=73486858-7479-471f-9ef6-57feb5a9c484 imgList=['0d30be2e-0882-4627-a4a1-f673fe784dca']
Thread-5164::INFO::2013-08-16 22:15:26,897::logUtils::40::dispatcher::(wrapper) Run and protect: getVolumeInfo(sdUUID='29a56c30-eacd-4b49-819b-09c8a88cb26f', spUUID='e5609057-ecc
9-4b5d-bd3e-5030465186eb', imgUUID='0d30be2e-0882-4627-a4a1-f673fe784dca', volUUID='eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', options=None)
Thread-5164::INFO::2013-08-16 22:15:26,902::volume::797::Storage.Volume::(getInfo) Info request: sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca volUUID = eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e 
Thread-5164::INFO::2013-08-16 22:15:27,457::volume::826::Storage.Volume::(getInfo) 29a56c30-eacd-4b49-819b-09c8a88cb26f/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e info is {'status': 'OK', 'domain': '29a56c30-eacd-4b49-819b-09c8a88cb26f', 'voltype': 'LEAF', 'description': '', 'parent': 'fcb5cf3e-3c70-4810-90de-58afdcefb9bf', 'format': 'COW', 'image': '0d30be2e-0882-4627-a4a1-f673fe784dca', 'ctime': '1376578498', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1376578498', 'apparentsize': '3221225472', 'children': [], 'pool': '', 'capacity': '107374182400', 'uuid': 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', 'truesize': '3221225472', 'type': 'SPARSE'}
Thread-5164::INFO::2013-08-16 22:15:27,457::logUtils::42::dispatcher::(wrapper) Run and protect: getVolumeInfo, Return response: {'info': {'status': 'OK', 'domain': '29a56c30-eacd-4b49-819b-09c8a88cb26f', 'voltype': 'LEAF', 'description': '', 'parent': 'fcb5cf3e-3c70-4810-90de-58afdcefb9bf', 'format': 'COW', 'image': '0d30be2e-0882-4627-a4a1-f673fe784dca', 'ctime': '1376578498', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1376578498', 'apparentsize': '3221225472', 'children': [], 'pool': '', 'capacity': '107374182400', 'uuid': 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', 'truesize': '3221225472', 'type': 'SPARSE'}}
Thread-5164::DEBUG::2013-08-16 22:15:27,457::task::1168::TaskManager.Task::(prepare) Task=`fdb14f76-6e50-450b-bad8-e0bbaa93e7f9`::finished: {'info': {'status': 'OK', 'domain': '29a56c30-eacd-4b49-819b-09c8a88cb26f', 'voltype': 'LEAF', 'description': '', 'parent': 'fcb5cf3e-3c70-4810-90de-58afdcefb9bf', 'format': 'COW', 'image': '0d30be2e-0882-4627-a4a1-f673fe784dca', 'ctime': '1376578498', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1376578498', 'apparentsize': '3221225472', 'children': [], 'pool': '', 'capacity': '107374182400', 'uuid': 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', 'truesize': '3221225472', 'type': 'SPARSE'}}
Thread-5165::INFO::2013-08-16 22:15:27,507::logUtils::40::dispatcher::(wrapper) Run and protect: mergeSnapshots(sdUUID='29a56c30-eacd-4b49-819b-09c8a88cb26f', spUUID='e5609057-ecc9-4b5d-bd3e-5030465186eb', vmUUID='', imgUUID='0d30be2e-0882-4627-a4a1-f673fe784dca', ancestor='fcb5cf3e-3c70-4810-90de-58afdcefb9bf', successor='eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', postZero='false')
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:28,370::task::889::TaskManager.Task::(_runJobs) Task=`68a5d413-6c1f-4ef7-be03-78bd1b240f6d`::Task.run: running job 0: mergeSnapshots: <bound method StoragePool.mergeSnapshots of <storage.sp.StoragePool object at 0x7fd71dda2810>> (args: ('29a56c30-eacd-4b49-819b-09c8a88cb26f', '', '0d30be2e-0882-4627-a4a1-f673fe784dca', 'fcb5cf3e-3c70-4810-90de-58afdcefb9bf', 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', False) kwargs: {})
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:28,374::task::315::TaskManager.Task::(run) Task=`68a5d413-6c1f-4ef7-be03-78bd1b240f6d`::Job.run: running mergeSnapshots: <bound method StoragePool.mergeSnapshots of <storage.sp.StoragePool object at 0x7fd71dda2810>> (args: ('29a56c30-eacd-4b49-819b-09c8a88cb26f', '', '0d30be2e-0882-4627-a4a1-f673fe784dca', 'fcb5cf3e-3c70-4810-90de-58afdcefb9bf', 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', False) kwargs: {}) callback None
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:28,381::resourceManager::175::ResourceManager.Request::(__init__) ResName=`29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca`ReqID=`f480fc51-59d6-417f-bc71-9f0287a38d7e`::Request was made in '/usr/share/vdsm/storage/resourceManager.py' line '174' at '__init__'
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:28,383::resourceManager::486::ResourceManager::(registerResource) Trying to register resource '29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca' for lock type 'exclusive'
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:15:28,705::image::344::Storage.Image::(getChain) sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca chain=[<storage.blockVolume.BlockVolume object at 0x7fd71dc0ce90>, <storage.blockVolume.BlockVolume object at 0x7fd71c2d2a90>] 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:28,707::resourceManager::528::ResourceManager::(registerResource) Resource '29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca' is free. Now locking as 'exclusive' (1 active user)
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:28,707::resourceManager::212::ResourceManager.Request::(grant) ResName=`29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca`ReqID=`f480fc51-59d6-417f-bc71-9f0287a38d7e`::Granted request
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:15:28,708::image::1123::Storage.Image::(merge) sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f vmUUID= imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca ancestor=fcb5cf3e-3c70-4810-90de-58afdcefb9bf successor=eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e postZero=False
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:15:29,062::image::1162::Storage.Image::(merge) merge with convert: src = /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e dst = /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/fcb5cf3e-3c70-4810-90de-58afdcefb9bf
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:15:29,822::blockVolume::174::Storage.Volume::(_create) Request to create RAW volume /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e_MERGE with size = 209715200 sectors
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:30,088::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /sbin/lvm lvchange --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \'a%360060160b610340002898da3ddb8e211|360060160b61034000aae6db4ddb8e211|360060160b6103400147afd08d8b8e211|360060160b61034002040dd90d5b8e211|360060160b61034002aabfc94ddb8e211|360060160b61034002cc9c272ddb8e211|360060160b61034007cf4000cd6b8e211|360060160b61034009490e066ddb8e211|360060160b610340096054d5bddb8e211|360060160b61034009cec7bd0e5bbe211|360060160b6103400ce6ec082ddb8e211|360060160b6103400e8f19e5984ede211|36d4ae520b6ff20001938d44805e2e515%\', \'r%.*%\' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --deltag OVIRT_VOL_INITIALIZING --addtag IU_0d30be2e-0882-4627-a4a1-f673fe784dca --addtag PU_00000000-0000-0000-0000-000000000000 --addtag MD_13 29a56c30-eacd-4b49-819b-09c8a88cb26f/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e_MERGE' (cwd None)
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:31,487::volume::1012::Storage.Volume::(qemuConvert) (qemuConvert): COPY /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e (qcow2) to /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e_MERGE (raw) START
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:15:31,487::misc::83::Storage.Misc.excCmd::(<lambda>) '/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/qemu-img convert -t none -f qcow2 /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e -O raw /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e_MERGE' (cwd None)
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:32:30,951::volume::1029::Storage.Volume::(qemuConvert) (qemuConvert): COPY /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e to /rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e_MERGE DONE
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:33,734::volume::582::Storage.Volume::(setLegality) sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca volUUID = _remove_me_DufeiwOT_eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e legality = ILLEGAL 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:33,742::volume::582::Storage.Volume::(setLegality) sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca volUUID = fcb5cf3e-3c70-4810-90de-58afdcefb9bf legality = ILLEGAL 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:33,748::image::950::Storage.Image::(removeSubChain) Remove volume _remove_me_DufeiwOT_eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e from image 0d30be2e-0882-4627-a4a1-f673fe784dca
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:33,749::blockVolume::207::Storage.Volume::(delete) Request to delete LV _remove_me_DufeiwOT_eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e of image 0d30be2e-0882-4627-a4a1-f673fe784dca in VG 29a56c30-eacd-4b49-819b-09c8a88cb26f 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:33,749::volume::582::Storage.Volume::(setLegality) sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca volUUID = _remove_me_DufeiwOT_eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e legality = ILLEGAL 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:34,735::image::950::Storage.Image::(removeSubChain) Remove volume fcb5cf3e-3c70-4810-90de-58afdcefb9bf from image 0d30be2e-0882-4627-a4a1-f673fe784dca
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:34,735::blockVolume::207::Storage.Volume::(delete) Request to delete LV fcb5cf3e-3c70-4810-90de-58afdcefb9bf of image 0d30be2e-0882-4627-a4a1-f673fe784dca in VG 29a56c30-eacd-4b49-819b-09c8a88cb26f 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:34,735::volume::582::Storage.Volume::(setLegality) sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca volUUID = fcb5cf3e-3c70-4810-90de-58afdcefb9bf legality = ILLEGAL 
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::INFO::2013-08-16 22:32:34,973::image::1188::Storage.Image::(merge) Merge src=/rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/_remove_me_DufeiwOT_eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e with dst=/rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/29a56c30-eacd-4b49-819b-09c8a88cb26f/images/0d30be2e-0882-4627-a4a1-f673fe784dca/fcb5cf3e-3c70-4810-90de-58afdcefb9bf was successfully finished.
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:32:34,974::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource '29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca'
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:32:34,974::resourceManager::553::ResourceManager::(releaseResource) Released resource '29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca' (0 active users)
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:32:34,974::resourceManager::558::ResourceManager::(releaseResource) Resource '29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca' is free, finding out if anyone is waiting for it.
68a5d413-6c1f-4ef7-be03-78bd1b240f6d::DEBUG::2013-08-16 22:32:34,976::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource '29a56c30-eacd-4b49-819b-09c8a88cb26f_imageNS.0d30be2e-0882-4627-a4a1-f673fe784dca', Clearing records.
Thread-5902::INFO::2013-08-16 22:32:43,758::logUtils::40::dispatcher::(wrapper) Run and protect: getVolumeInfo(sdUUID='29a56c30-eacd-4b49-819b-09c8a88cb26f', spUUID='e5609057-ecc9-4b5d-bd3e-5030465186eb', imgUUID='0d30be2e-0882-4627-a4a1-f673fe784dca', volUUID='eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', options=None)
Thread-5902::INFO::2013-08-16 22:32:43,759::volume::797::Storage.Volume::(getInfo) Info request: sdUUID=29a56c30-eacd-4b49-819b-09c8a88cb26f imgUUID=0d30be2e-0882-4627-a4a1-f673fe784dca volUUID = eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e 
Thread-5902::INFO::2013-08-16 22:32:44,221::volume::826::Storage.Volume::(getInfo) 29a56c30-eacd-4b49-819b-09c8a88cb26f/0d30be2e-0882-4627-a4a1-f673fe784dca/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e info is {'status': 'OK', 'domain': '29a56c30-eacd-4b49-819b-09c8a88cb26f', 'voltype': 'LEAF', 'description': '', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'image': '0d30be2e-0882-4627-a4a1-f673fe784dca', 'ctime': '1376691330', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1376691331', 'apparentsize': '107374182400', 'children': [], 'pool': '', 'capacity': '107374182400', 'uuid': 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', 'truesize': '107374182400', 'type': 'PREALLOCATED'}
Thread-5902::INFO::2013-08-16 22:32:44,222::logUtils::42::dispatcher::(wrapper) Run and protect: getVolumeInfo, Return response: {'info': {'status': 'OK', 'domain': '29a56c30-eacd-4b49-819b-09c8a88cb26f', 'voltype': 'LEAF', 'description': '', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'image': '0d30be2e-0882-4627-a4a1-f673fe784dca', 'ctime': '1376691330', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1376691331', 'apparentsize': '107374182400', 'children': [], 'pool': '', 'capacity': '107374182400', 'uuid': 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', 'truesize': '107374182400', 'type': 'PREALLOCATED'}}
Thread-5902::DEBUG::2013-08-16 22:32:44,223::task::1168::TaskManager.Task::(prepare) Task=`220ba866-cea1-438d-a5d3-f20ec20b37a0`::finished: {'info': {'status': 'OK', 'domain': '29a56c30-eacd-4b49-819b-09c8a88cb26f', 'voltype': 'LEAF', 'description': '', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'image': '0d30be2e-0882-4627-a4a1-f673fe784dca', 'ctime': '1376691330', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '1376691331', 'apparentsize': '107374182400', 'children': [], 'pool': '', 'capacity': '107374182400', 'uuid': 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e', 'truesize': '107374182400', 'type': 'PREALLOCATED'}}

----------------------------------------------------------------------
Following comment by avoss on September 06 at 14:20:49, 2013

Metadata on disk:
#  dd if=/rhev/data-center/e5609057-ecc9-4b5d-bd3e-5030465186eb/mastersd/dom_md/metadata bs=512 count=1 skip=13
DOMAIN=ed9b094f-9469-4230-a90d-9cf5924964a4
VOLTYPE=LEAF
CTIME=1376601291
FORMAT=COW
IMAGE=e66e3a77-a584-49fb-a07b-709d4bca556b
DISKTYPE=2
PUUID=00000000-0000-0000-0000-000000000000
LEGALITY=LEGAL
MTIME=1376694199
POOL_UUID=
SIZE=209715200
TYPE=SPARSE
DESCRIPTION=
EOF
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00477796 s, 107 kB/s

# psql -U postgres engine -c "select image_guid, size, it_guid, vm_snapshot_id, volume_type, volume_format from images where image_group_id = '0d30be2e-0882-4627-a4a1-f673fe784dca';"
Password for user postgres: 
              image_guid              |     size     |               it_guid                |            vm_snapshot_id            | volume_type | volume_format 
--------------------------------------+--------------+--------------------------------------+--------------------------------------+-------------+---------------
 eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e | 107374182400 | 00000000-0000-0000-0000-000000000000 | 2db78268-aae4-443d-8c6a-40df8ffc3494 |           1 |             5
(1 row)


# psql -U postgres engine -c "select * from base_disks where disk_alias = 'l-dtarbox-vm01_Disk1';"
Password for user postgres: 
               disk_id                | disk_interface | wipe_after_delete | propagate_errors |      disk_alias      | disk_description | shareable | boot 
--------------------------------------+----------------+-------------------+------------------+----------------------+------------------+-----------+------
 0d30be2e-0882-4627-a4a1-f673fe784dca | VirtIO         | f                 | Off              | l-dtarbox-vm01_Disk1 |                  | f         | t
(1 row)

# psql -U postgres engine -c "select * from disk_image_dynamic where image_id = 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e';"
Password for user postgres: 
               image_id               | read_rate | write_rate | actual_size | read_latency_seconds | write_latency_seconds | flush_latency_seconds |         _update_date        
  
--------------------------------------+-----------+------------+-------------+----------------------+-----------------------+-----------------------+-----------------------------
--
 eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e |         0 |          0 |  3221225472 |          0.000000000 |           0.000000000 |           0.000000000 | 2013-08-20 08:54:11.576124-0
4
(1 row)

# psql -U postgres engine -c "select * from image_storage_domain_map where image_id = 'eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e';"
Password for user postgres: 
               image_id               |          storage_domain_id           
--------------------------------------+--------------------------------------
 eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e | 29a56c30-eacd-4b49-819b-09c8a88cb26f
(1 row)

----------------------------------------------------------------------
Following comment by derez on September 08 at 08:39:47, 2013

Hi Allan,

I can't find the latest engine logs (when the issue occurred),
i.e. from 2013-08-16 18:15 .. 2013-08-16 18:43.
The most recent I've found is in 'sosreport-LogCollector-20130816164722.tar.xz' 
(up to 2013-08-16 16:47).

Do you happen to still have these logs?

----------------------------------------------------------------------
Following comment by derez on September 09 at 11:30:26, 2013

Fede, perhaps you can figure out what went wrong from the vdsm log?

----------------------------------------------------------------------
Following comment by eedri on September 15 at 10:56:05, 2013

if this bug is approved for z-stream, please change target release to 3.2.X
so it will be tracked for cloning.

----------------------------------------------------------------------
Following comment by derez on September 16 at 11:08:26, 2013

Returning removed needinfo flags.

----------------------------------------------------------------------
Following comment by lyarwood on September 16 at 14:07:17, 2013

I've reviewed the latest LC and can't see any clear failures or issues with the actual merge. I'm asking for more information on the unbootable volume before proceeding here and will update with my findings and latest logs later today. Adding a needinfo against myself as a reminder.

----------------------------------------------------------------------
Following comment by lyarwood on September 16 at 17:17:38, 2013

Created attachment 798363 [details]
SFDC#00922784 - engine.log from merge.

(In reply to Daniel Erez from comment #3)
> Hi Allan,
> 
> I can't find the latest engine logs (when the issue occurred),
> i.e. from 2013-08-16 18:15 .. 2013-08-16 18:43.
> The most recent I've found is in
> 'sosreport-LogCollector-20130816164722.tar.xz' 
> (up to 2013-08-16 16:47).
> 
> Do you happen to still have these logs?

engine.log attached, full LC also here :

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/rhevm-2013082008541377003248/

Again I couldn't see anything wrong with the actual merge here. Attaching the SPM vdsm.log from this next.

----------------------------------------------------------------------
Following comment by lyarwood on September 16 at 17:24:43, 2013

Created attachment 798365 [details]
SFDC#00922784 - vdsm.log from SPM during merge.

Full SPM sosreport here :

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/dwi1-2013082012541377003287/

----------------------------------------------------------------------
Following comment by derez on September 16 at 18:09:05, 2013

After looking at the newly added logs, the issue looks exactly the same as bug 970889 which has already been cloned and merged into 3.2.z (bug 991542).

----------------------------------------------------------------------
Following comment by amureini on September 17 at 07:09:46, 2013

(In reply to Daniel Erez from comment #10)
> After looking at the newly added logs, the issue looks exactly the same as
> bug 970889 which has already been cloned and merged into 3.2.z (bug 991542).
The reproduction sounds somewhat different, even though the root cause seems to be the same.

Moving to ON_QA to verify that the provided fix covers this issue.
Aharon - if you feel this is redundant, feel free to CLOSE NEXTRELEASE

----------------------------------------------------------------------
Following comment by amureini on September 17 at 07:10:44, 2013

(In reply to Allon Mureinik from comment #11)
> (In reply to Daniel Erez from comment #10)
> > After looking at the newly added logs, the issue looks exactly the same as
> > bug 970889 which has already been cloned and merged into 3.2.z (bug 991542).
> The reproduction sounds somewhat different, even though the root cause seems
> to be the same.
> 
> Moving to ON_QA to verify that the provided fix covers this issue.
> Aharon - if you feel this is redundant, feel free to CLOSE NEXTRELEASE
CLOSE CURRENTREALSE, of course - this fix in included in 3.2.3.

----------------------------------------------------------------------
Following comment by lyarwood on September 17 at 09:27:23, 2013

(In reply to Daniel Erez from comment #10)
> After looking at the newly added logs, the issue looks exactly the same as
> bug 970889 which has already been cloned and merged into 3.2.z (bug 991542).

Daniel, could you elaborate on this?

While there is an instance of BZ#970889 in the engine.log I supplied I'm not sure how it applies to the l-dtarbox-vm01 guest listed in comment #1.

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/rhevm-2013082008541377003248/var/log/ovirt-engine/engine.log.1-20130817_033401_0034.xz

18534 2013-08-16 08:59:38,645 INFO  [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-4-thread-42) [2459bab1] Lock Acquired to object EngineLock [exclusiveLocks= key: 4b8bdba2-399a-4f52-a01b-a733fcdcda82 value: DISK
18535 , sharedLocks= key: a56bd7d2-1ac7-4ba4-b6da-13cf6f7fa907 value: VM
18536 ]
18537 2013-08-16 08:59:38,657 INFO  [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-4-thread-42) [2459bab1] Running command: LiveMigrateDiskCommandTask handler: CreateImagePlaceholderTaskHandler internal: true. Entities affec      ted :  ID: 4b8bdba2-399a-4f52-a01b-a733fcdcda82 Type: Disk,  ID: 977f3070-fa41-4765-acf6-277c1ca961dc Type: Storage
18538 2013-08-16 08:59:38,660 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.CloneImageGroupStructureVDSCommand] (pool-4-thread-42) [2459bab1] START, CloneImageGroupStructureVDSCommand( storagePoolId = e5609057-ecc9-4b5d-bd3e-5030465186eb, ignoreFailoverLimit = false, compatabilityVersion = null, storageDomainId = ea05219d-bfaf-42a6-80da-715d5eb0c442, imageGroupId = 4b8bdba2-399a-4f52-a01b-a733fcdcda82, dstDomainId = 977f3070-fa41-4765-acf6-277c1ca961dc), log id:       5ea819a5

The image doesn't actually relate to l-dtarbox-vm01 but visola-vm1, that has since been removed.

----------------------------------------------------------------------
Following comment by lyarwood on September 17 at 11:16:29, 2013

Just for completeness, the original LSM logs for the guest in question are listed below and do not show any sign of BZ#970889 AFAICT :

engine.log :

Correlation IDs : 

2964fa97 (LiveMigrateVmDisksCommand) 
693a69c  (LiveMigrateDiskCommand)

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/rhevm-2013082008541377003248/var/log/ovirt-engine/engine.log.1-20130817_033401_0034.xz

SPM vdsm.log :

3aead839-8cec-41f4-a1e1-b9e9571f31e0 (createVolume)
6f546531-fa5a-4d2f-af97-7596c07d20b8 (cloneImageStructure)
a8bde1dc-45fa-489d-94cb-221a92de5b3f (syncImageData) starts.      

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/dwi1-2013082012541377003287/var/log/vdsm/vdsm.log.52.xz

a8bde1dc-45fa-489d-94cb-221a92de5b3f (syncImageData) finishes.

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/dwi1-2013082012541377003287/var/log/vdsm/vdsm.log.51.xz

HSM vdsm.log :

2fe606f4-2a1f-46d0-b246-f5cdd22cf46e (vmDiskReplicateStart)
cdbde112-4419-4a1e-b076-ca37ecdeae9b (vmDiskReplicateFinish)

http://lyarwood-fab.usersys.redhat.com/tickets/00922784/dwi3-2013082012541377003287/var/log/vdsm/vdsm.log.33.xz

As with the merge I listed above I can't see any issues here that could later cause the guest to become unbootable.

To this end we have asked the customer for output from the following commands to review the current state of the now merged RAW base volume :

# lvchange -ay 29a56c30-eacd-4b49-819b-09c8a88cb26f/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e 
# hexdump -C /dev/29a56c30-eacd-4b49-819b-09c8a88cb26f/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e | head -n 100
# qemu-img info /dev/29a56c30-eacd-4b49-819b-09c8a88cb26f/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e
# lvchange -an 29a56c30-eacd-4b49-819b-09c8a88cb26f/eee960f4-9207-4e6b-a4ca-e5c4d45b2f5e

Could we move this back to assigned if we agree that this guest is not hitting BZ#970889? We will attach the case to BZ#970889 anyway as we have an instance of this in the logs.

----------------------------------------------------------------------
Following comment by derez on September 19 at 13:41:25, 2013

As discussed with Lee and Federico, the underling issue seems to be with the guest. The provided logs (engine/vdsm) don't reveal any errors during snapshot merge. Moving to ON_QA.

Comment 2 Aharon Canan 2013-10-14 13:15:06 UTC
it should be closed, (not sure even why with have clone if no issue exists)
anyway I rechecked it and everything looks fine
I also discussed it with derez, no real issue here.
marking verified.

Comment 3 Eyal Edri 2013-10-27 18:51:15 UTC
marking closed per comment #2.


Note You need to log in before you can comment on or make changes to this bug.