Created attachment 696579 [details] engine.log from RHEV-M server

Description of problem:
Storage live migration of a thick boot disk results in two issues: the VM is paused due to an "unknown storage error" and the disk itself is corrupted.

Version-Release number of selected component (if applicable):
RHEV 3.1, RHEL 6.3 guest, NetApp Clustered ONTAP 8.1.2, RHEV-H as shipped w/ RHEV 3.1

How reproducible:
Reproducible with a thick disk; unable to reproduce with a thin disk.

Steps to Reproduce:
1. Create a RHEL 6.3 guest with a thickly provisioned disk
2. Allow the new VM to reboot post-install
3. Initiate a live migration of the storage

Actual results:
The virtual machine is paused. The VM is not able to resume and requires a non-graceful shutdown. When the VM is restarted, the following message appears in the Spice console: "Booting from Hard, Disk Geom Error_". Reverting to a snapshot results in a functional VM.

Expected results:
Storage live migration of a thick disk should result in a functioning VM that does not pause and does not require a restore from snapshot.

Additional info:
Created attachment 696794 [details] logs from rhev-h node
(In reply to comment #2)
> Created attachment 696794 [details]
> logs from rhev-h node

I couldn't correlate the engine logs to the vdsm/libvirt ones. Can you double-check that you provided all the relevant logs? Are the engine/rhevh machines time-synchronized? (That should make it easier for you to find them.)

The vdsm logs should contain calls for:

vmSnapshot
cloneImageStructure
vmDiskReplicateStart
syncImageData

The libvirt log should contain calls for:

__com.redhat_drive-mirror
Created attachment 697861 [details] logs for rhevh04
Created attachment 697863 [details] newer rhevm logs
Federico - I reproduced the issue again this morning for the sole purpose of getting you more complete logs. I've verified that the VM (live_mig_03) was running on the thin hypervisor rhevh04.
There are two parts to the problem. One is that the drive mirroring *on the destination* is failing:

2013-02-15 15:06:42.733+0000: 10773: debug : qemuMonitorIOWrite:432 : QEMU_MONITOR_IO_WRITE: mon=0x7f28f000e840 buf={"execute":"__com.redhat_drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"/rhev/data-center/cf0a7a60-99d1-43de-88d6-a39c8829c00e/fc3dde9a-73f4-473d-8e2b-af0a1b74e185/images/b9a983f8-8099-45cc-9476-36c4bf9256f1/14af948e-3a4b-455b-9a32-7195c29ecb89","full":false,"speed":0,"mode":"existing","format":"qcow2"},"id":"libvirt-434"}
2013-02-15 15:06:42.764+0000: 10773: debug : qemuMonitorIOProcess:327 : QEMU_MONITOR_IO_PROCESS: mon=0x7f28f000e840 buf={"return": {}, "id": "libvirt-434"}
2013-02-15 15:06:42.765+0000: 10773: debug : qemuMonitorIOProcess:327 : QEMU_MONITOR_IO_PROCESS: mon=0x7f28f000e840 buf={"timestamp": {"seconds": 1360940802, "microseconds": 765542}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 21474836480, "offset": 0, "speed": 0, "type": "mirror", "error": "Input/output error"}}

I don't have more information on it other than "Input/output error". Maybe you can give us more insight into this part (why do we get I/O errors on the destination?).

That said, vdsm is correctly reporting the issue at vmDiskReplicateFinish time (returning 'Drive replication error'):

Thread-530894::DEBUG::2013-02-15 15:08:53,911::BindingXMLRPC::894::vds::(wrapper) client [172.20.44.185]::call vmDiskReplicateFinish with ('7bb1c906-1f9d-41d0-8f55-b022b68185bc', {'device': 'disk', 'domainID': 'e2cebd7d-14c2-421a-b482-ad9072a95eaf', 'volumeID': '14af948e-3a4b-455b-9a32-7195c29ecb89', 'poolID': 'cf0a7a60-99d1-43de-88d6-a39c8829c00e', 'imageID': 'b9a983f8-8099-45cc-9476-36c4bf9256f1'}, {'device': 'disk', 'domainID': 'fc3dde9a-73f4-473d-8e2b-af0a1b74e185', 'volumeID': '14af948e-3a4b-455b-9a32-7195c29ecb89', 'poolID': 'cf0a7a60-99d1-43de-88d6-a39c8829c00e', 'imageID': 'b9a983f8-8099-45cc-9476-36c4bf9256f1'}) {}
Thread-530894::ERROR::2013-02-15 15:08:53,918::libvirtvm::2121::vm.Vm::(diskReplicateFinish) vmId=`7bb1c906-1f9d-41d0-8f55-b022b68185bc`::Replication job not found for disk vda ({'device': 'disk', 'domainID': 'e2cebd7d-14c2-421a-b482-ad9072a95eaf', 'volumeID': '14af948e-3a4b-455b-9a32-7195c29ecb89', 'poolID': 'cf0a7a60-99d1-43de-88d6-a39c8829c00e', 'imageID': 'b9a983f8-8099-45cc-9476-36c4bf9256f1'})
Thread-530894::DEBUG::2013-02-15 15:08:53,933::BindingXMLRPC::900::vds::(wrapper) return vmDiskReplicateFinish with {'status': {'message': 'Drive replication error', 'code': 55}}

Sadly this is not handled correctly by the engine, which goes ahead and deletes the source (instead of the broken destination):

2013-02-15 10:08:53,872 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-3-thread-27) START, VmReplicateDiskFinishVDSCommand(HostName = rhevh04.stl.netapp.com, HostId = 768a6dfe-59be-11e2-8429-52540046a077, vmId=7bb1c906-1f9d-41d0-8f55-b022b68185bc), log id: 4dbd9cdd
2013-02-15 10:08:53,909 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-3-thread-27) FINISH, VmReplicateDiskFinishVDSCommand, log id: 4dbd9cdd
2013-02-15 10:08:53,922 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-3-thread-27) START, DeleteImageGroupVDSCommand( storagePoolId = cf0a7a60-99d1-43de-88d6-a39c8829c00e, ignoreFailoverLimit = false, compatabilityVersion = null, storageDomainId = e2cebd7d-14c2-421a-b482-ad9072a95eaf, imageGroupId = b9a983f8-8099-45cc-9476-36c4bf9256f1, postZeros = false, forceDelete = false), log id: 3a934832
2013-02-15 10:08:55,904 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-3-thread-27) FINISH, DeleteImageGroupVDSCommand, log id: 3a934832

There is probably a problem in the error handling on the engine side, but we should try to figure out what the problem is on the destination ("Input/output error").
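For illustration only, here is a minimal sketch of the ordering the engine should follow when vmDiskReplicateFinish reports a failure. This is hedged Python pseudocode, not the actual engine code (which is Java); handle_replicate_finish, delete_source and delete_destination are hypothetical names introduced here, with the delete callbacks standing in for DeleteImageGroupVDSCommand on the respective storage domains.

# Sketch (assumptions as noted above): act on the status code returned by
# vdsm's vmDiskReplicateFinish before touching any image. Only a successful
# replication should ever lead to deleting the source; a failure means the
# destination is the broken copy and is the one to clean up.

class ReplicationError(Exception):
    pass

def handle_replicate_finish(reply, delete_source, delete_destination):
    """reply is the dict returned by vdsm's vmDiskReplicateFinish."""
    status = reply['status']
    if status['code'] != 0:
        # e.g. {'message': 'Drive replication error', 'code': 55} as in the
        # log above: the mirror never converged, so drop the destination.
        delete_destination()
        raise ReplicationError(status['message'])
    # Only after a successful replication is it safe to drop the source.
    delete_source()

# With the reply seen in the log above, the source would be left untouched:
reply = {'status': {'message': 'Drive replication error', 'code': 55}}
try:
    handle_replicate_finish(reply,
                            delete_source=lambda: print('deleting source'),
                            delete_destination=lambda: print('deleting destination'))
except ReplicationError as err:
    print('aborted:', err)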
The issue in the engine has already been solved as part of commit fbe1b4903e9b64d7eb12f264529432b9e60deca1 [VmReplicateDiskFinishTaskHandler throws an exception on VmReplicateDiskFinish failure - prevents deletion of the source disk]
Federico - what additional information do you need from me? If it's for your question "why do we get I/O errors on the destination?", I have no idea. It works fine with a thin guest but breaks with a thick guest, and I'm not getting any errors on the storage side of things.
Reducing severity since the problematic behaviour was already solved as noted in comment 8
Federico - my lab is back to 100%. I've redeployed one of my hypervisors as a RHEL 6.3 thick hypervisor. The issue remains, but I'm getting different symptoms/errors now on the thick hypervisor as compared to the thin. For example, in the original issue, the live storage migration was reported as successful, but the VM would pause due to an "unknown storage error". Using the thick hypervisor, it reports that the live storage migration failed, but the disk is clearly visible in the new storage area. Additionally, it now reports that the disk is "thin provisioned" (perhaps that is expected?). The other difference is that on the thin-h, when I attempted to restart the VM it failed with "Booting from Hard, Disk Geom Error_" appearing in the Spice console. Using the thick-h, I don't even get to open the Spice console before the VM start fails with "Exit message:'truesize'". Let me know all of the logs and files that you would like me to attach from the thick-h and from rhev-m.
Jon, let's try to stick to the rhel-6.3+vdsm (thick-h) setup, because you can benefit from newer packages and debugging is easier. Can you try a live storage migration and attach the logs here (engine.log and vdsm.log)? Can you also provide the component versions (vdsm/qemu-kvm-rhev/libvirt)? Thanks.
Created attachment 703894 [details] engine log for failed Live Storage Migration w/ thick-h
Created attachment 703895 [details] vdsm log for failed Live Storage Migration w/thick-h
Created attachment 703911 [details] output of `qemu-img info` on src & dst, vdsm log, engine log, messages log
Federico - not sure what happened to my notes, but I just reran the test as follows:

1. created a new (thick) 20GB VM
2. made the thick-h the SPM
3. initiated the storage live migration
4. captured output of `qemu-img info image_name` on the src and destination images (saved output in txt files)
5. captured the vdsm, engine, and messages logs

As you'll see in the txt files and the logs, at the end it fails to find the copied image. I ran `ls` and it complained of stale file handles.
(In reply to comment #16)
> Federico - not sure what happened to my notes, but I just reran the test as
> follows:

Thank you Jon for the additional information you provided; it has been critical to understanding the issue.

The involved piece of code is (image.py:~64):

if (volParams['volFormat'] == volume.COW_FORMAT
        or volParams['prealloc'] == volume.SPARSE_VOL):
    volTmpSize = volParams['size']
else:
    volTmpSize = TEMPORARY_VOLUME_SIZE  # in sectors (10M)

Somehow you have been able to create a base disk that is raw preallocated. Did you use the REST API? (As far as I remember it's blocked from the UI.) You always mentioned a thick disk, but I had been taking that to mean raw (sparse) rather than raw preallocated.

Now, that said, the piece of code I quoted is designed to optimize out the preallocated volumes and just create a 10M placeholder (for other flows). This leaves the snapshot on top with a 10M virtual size (in the qcow header), as we can observe from your (destination) qemu-img info:

[root@rhevh01 68457151-b4c3-4f57-8b12-2bd186ca5a7e]# qemu-img info b9ad75b4-ec84-496d-b60a-01f30e03f754
image: b9ad75b4-ec84-496d-b60a-01f30e03f754
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 140K
cluster_size: 65536

When the VM tries to mirror any data with an offset > 10M it receives an IOError and the sync fails. I'm working on a solution for this. Thanks again.
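If it helps to see the failure mode outside of RHEV, something along the following lines reproduces its shape with plain qemu tools. This is only a sketch: it assumes qemu-img and qemu-io are installed, the file names are arbitrary, and the exact option spelling and error text vary between qemu versions.

# Sketch (assumptions as noted above): a qcow2 overlay whose virtual size
# (10M) is smaller than the 20G disk being mirrored, mimicking the placeholder
# that inherited TEMPORARY_VOLUME_SIZE into its qcow header. Writes within the
# first 10M succeed; writes past that boundary fail, just like the mirror job.
import subprocess

def run(cmd):
    print('$', ' '.join(cmd))
    return subprocess.run(cmd, capture_output=True, text=True)

run(['qemu-img', 'create', '-f', 'raw', 'base.raw', '20G'])
run(['qemu-img', 'create', '-f', 'qcow2',
     '-o', 'backing_file=base.raw,backing_fmt=raw', 'overlay.qcow2', '10M'])
print(run(['qemu-img', 'info', 'overlay.qcow2']).stdout)  # virtual size: 10M

ok = run(['qemu-io', '-c', 'write 1M 64k', 'overlay.qcow2'])    # inside 10M
bad = run(['qemu-io', '-c', 'write 16M 64k', 'overlay.qcow2'])  # past 10M
print(ok.stdout, ok.stderr)    # expect a completed 64k write
print(bad.stdout, bad.stderr)  # expect a failed write

The mirror job in the BLOCK_JOB_COMPLETED event quoted earlier is effectively attempting the second kind of write: copying a 21474836480-byte disk into a destination that only advertises 10485760 bytes.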
Federico - I did not use the rest api to create the VM or disk image. I only used the web admin tool.
(In reply to comment #18)
> Federico - I did not use the rest api to create the VM or disk image. I only
> used the web admin tool.

Ack, a raw preallocated base is fully supported...
image: placeholder optimization on preallocation

When the destination of a copy or move command is an NFS domain we should try to optimize the placeholder creation, avoiding unnecessary pre-zeroing (the only relevant case: RAW PREALLOCATED). In that case we previously used TEMPORARY_VOLUME_SIZE to create a placeholder with a temporary size of 10MB. This interferes with live storage migration, as the following volumes in the chain inherit the temporary size inside the qcow header (virtual disk size). With this patch we instead force the creation of the volume as SPARSE, as this has no side effect on the qcow header.

http://gerrit.ovirt.org/#/c/12692/

Change-Id: I32811bd45320ef02bfc593a71dfdafc0b0550c7d
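For clarity, here is a rough sketch of the behaviour the patch describes, based only on the commit message above (not the actual gerrit 12692 diff); placeholder_alloc_and_size and the constants below are stand-ins introduced here, not vdsm's real volume module.

# Sketch (assumptions as noted above): when the destination is an NFS domain
# and the source volume is RAW PREALLOCATED, create the placeholder SPARSE at
# its real size instead of a 10M TEMPORARY_VOLUME_SIZE, so any qcow volume
# created on top of it inherits the correct virtual disk size.

RAW_FORMAT = 'raw'
PREALLOCATED_VOL, SPARSE_VOL = 'preallocated', 'sparse'

def placeholder_alloc_and_size(volParams, dst_is_nfs):
    """Return (preallocation, size in sectors) for the destination placeholder."""
    prealloc = volParams['prealloc']
    if (dst_is_nfs and volParams['volFormat'] == RAW_FORMAT
            and prealloc == PREALLOCATED_VOL):
        # Skip the expensive pre-zeroing on NFS, but keep the real size.
        prealloc = SPARSE_VOL
    return prealloc, volParams['size']

# Example: a 20G raw preallocated boot disk (size expressed in 512-byte
# sectors, as in the image.py snippet quoted earlier) moving to NFS:
params = {'volFormat': RAW_FORMAT, 'prealloc': PREALLOCATED_VOL, 'size': 41943040}
print(placeholder_alloc_and_size(params, dst_is_nfs=True))  # ('sparse', 41943040)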
Federico - I finally had the opportunity to try the new vdsm packages that you posted. Unfortunately, the storage live migration still failed. I'll post logs and output from `qemu-img info _img_`.

Jon
Created attachment 707223 [details] logs from thick-h

Tried storage live migration w/ 4 test vdsm packages from Federico S.
Created attachment 707224 [details] "qemu-img info" output from source and destination storage domains
Created attachment 707417 [details] vdsm log for latest failed storage live migration

Forgot to attach this last night.
Created attachment 707418 [details] engine log from latest failed storage live migration

Forgot to add this last night.
(In reply to comment #24)
> Created attachment 707417 [details]
> vdsm log for latest failed storage live migration
>
> forgot to attach this last night

From the vdsm log it looks like the host (where the VM was running) was not the SPM. Since the fix actually resolves a problem in cloneImageStructure, you should double-check that you used the patched rpm on the SPM too. Thanks.
Created attachment 708366 [details] vdsm, engine, and qemu-img logs/txt files
Federico - the latest tarball contains the vdsm log, engine log, and the qemu-img text files. I confirmed that the thick-h is both the host for the Storage Live Migration VM and the SPM. The Storage Live Migration itself worked this time, although the event log in RHEVM says that it failed. Also, qemu-img was no longer able to get information from the disk images once the storage live migration finished.
The last test (comment 28) is not valid. Please disregard.
Created attachment 708476 [details] vdsm and engine logs from successful run of storage live migration w/ Federico's patches.
The latest test w/ Federico's patches was successful. This looks to be fixed on my end. Thanks, Federico.
Moved a preallocated disk on NFS storage with no failure.

Verified on sf13 with vdsm-4.10.2-14.0.el6ev.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0886.html