Description of problem:
A VM with a snapshot that contains memory state cannot be exported.

Version-Release number of selected component (if applicable):
3.6 (7a891290ffac4bcc4a0e119481b2c1b7ac0254e0)

How reproducible:
100%

Steps to Reproduce:
1. Create a VM
2. Take a snapshot with memory
3. Export the VM (without collapsing snapshots)

Actual results:
Export fails

Expected results:
VM is exported to the export domain

Additional info:
This is a regression caused by the addition of Cinder. In CopyImageGroupCommand#canDoAction we fetch the disk to be exported from the DB in order to validate the disk storage type. The problem is that in 3.6 memory snapshots are not represented as disks in the DB, so the canDoAction method returns false (without giving any reason).
The posted patch eliminates the can-do-action failure for memory volumes, but the problem remains. This time the error seems to be on the host: for some reason the metadata volume is not created in qcow2 format:

2bba5363-25ea-4a23-afdd-7bb96e6e10b9::ERROR::2015-11-15 23:56:39,282::image::490::Storage.Image::(_interImagesCopy) Copy image error: image=45f5db63-bf54-4631-bf8a-c5f1fb099796, src domain=1118b4b4-828a-41b6-95fc-c79c3e4d27dd, dst domain=6f94de72-f824-4512-b7d9-8e5abe2d88b6
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 481, in _interImagesCopy
    self._wait_for_qemuimg_operation(operation)
  File "/usr/share/vdsm/storage/image.py", line 138, in _wait_for_qemuimg_operation
    operation.wait(self._QEMU_LOGGING_INTERVAL)
  File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 283, in wait
    raise QImgError(self._command.returncode, "", self.error)
QImgError: ecode=1, stdout=, stderr=qemu-img: Could not open '/rhev/data-center/00000001-0001-0001-0001-00000000011a/1118b4b4-828a-41b6-95fc-c79c3e4d27dd/images/45f5db63-bf54-4631-bf8a-c5f1fb099796/9aa782de-2f18-4a0a-865f-e573f0605453': Image is not in qcow2 format , message=None
Created attachment 1094656 [details] vdsm log
Hi Arik,

Which versions of vdsm/qemu are you using? I've tried to reproduce the issue and am getting a different error:

"
222fc76c-de1b-4220-86b8-4fe629768fa1::ERROR::2015-11-16 14:59:44,479::task::866::Storage.TaskManager.Task::(_setError) Task=`222fc76c-de1b-4220-86b8-4fe629768fa1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 332, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1557, in moveImage
    vmUUID, op, postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 507, in move
    self._interImagesCopy(destDom, srcSdUUID, imgUUID, chains)
  File "/usr/share/vdsm/storage/image.py", line 463, in _interImagesCopy
    raise se.CopyImageError()
CopyImageError: low level Image copy failed: ()
222fc76c-de1b-4220-86b8-4fe629768fa1::DEBUG::2015-11-16 14:59:44,479::task::885::Storage.TaskManager.Task::(_run) Task=`222fc76c-de1b-4220-86b8-4fe629768fa1`::Task._run: 222fc76c-de1b-4220-86b8-4fe629768fa1 () {} failed - stopping task
"

Version:
vdsm-4.17.10.1-0.el7ev.noarch
qemu-img-rhev-2.3.0-31.el7_2.1.x86_64

@Nir - what do you think? An issue in qemu?
(In reply to Daniel Erez from comment #3)
It may be the same problem, since this error appears in my log as well. Check a few lines above it to see whether the errors I quoted appear too.

The versions I'm using:
vdsm-4.17.999-50.git67f4b2b.f22
qemu-img-2.4.0-2.fc22
(In reply to Daniel Erez from comment #3)
> Hi Arik,
>
> Which versions of vdsm/qemu are you using? I've tried to reproduce the issue
> and getting a different error:
...
> CopyImageError: low level Image copy failed: ()
...
> @Nir - what do you think? An issue in qemu?

There is not enough info in this error to tell anything. Check the error of the qemu-img command.
Hi Kevin,

It seems we're getting an 'Image is not in qcow2 format' error [1] after invoking 'qemu-img convert' [2]. The file format is indeed raw ('qemu-img info' shows that). The thing is that it worked fine on an earlier version (qemu-img-rhev-0.12.1.2-2.448.el6_6.x86_64); was the mismatch probably just silently ignored? It fails on a newer version (qemu-img-rhev-2.3.0-31.el7_2.1.x86_64). Was there any change between those versions that might lead to this?

Thanks!

[1] QImgError: ecode=1, stdout=[], stderr=["qemu-img: Could not open '/rhev/data-center/00000001-0001-0001-0001-00000000001d/b4889450-5b53-481a-ae9a-f63decac46de/images/258ff498-e006-4982-b981-2fa2595d6604/8ce3a35b-3b22-4deb-ab57-3bb56b49c6fa': Image is not in qcow2 format"], message=None

[2] /usr/bin/qemu-img convert -t none -T none -f qcow2 /rhev/data-center/00000001-0001-0001-0001-00000000001d/b4889450-5b53-481a-ae9a-f63decac46de/images/258ff498-e006-4982-b981-2fa2595d6604/8ce3a35b-3b22-4deb-ab57-3bb56b49c6fa -O qcow2 -o compat=0.10 /rhev/data-center/mnt/derez1.usersys:_home_data_export1/12133f75-b1ab-4f00-b7ef-87c2c0f235be/images/258ff498-e006-4982-b981-2fa2595d6604/8ce3a35b-3b22-4deb-ab57-3bb56b49c6fa
Let me see if I understand this correctly:

1. qemu-img info shows format qcow2 for the source image
2. qemu-img convert is used to copy from source to destination, both qcow2
3. qemu-img info shows format raw for the source image

Note that qemu-img convert opens the source file read-only, so if you see what I described above, can you confirm that no other action was performed between 1. and 3.? This looks rather unlikely.

Is the destination image detected as raw as well or is it correctly qcow2?

Can you post a hexdump of the first 512 bytes? (hexdump -C -n 512 <path>)
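For a quick manual look at the image header, something like the following could stand in for reading the hexdump by eye. This is only a sketch: the helper name has_qcow2_magic is made up for illustration, and it checks nothing beyond the 4-byte qcow2 magic ("QFI" followed by 0xfb) at offset 0. It is a debugging aid, not a way to decide the image format in management code, because a guest can write arbitrary bytes, including a qcow2 header, into a raw image.

```python
# The qcow2 format starts with the magic bytes 'Q', 'F', 'I', 0xfb.
QCOW2_MAGIC = b"QFI\xfb"


def has_qcow2_magic(path):
    """Return True if the file at `path` begins with the qcow2 magic.

    Hypothetical helper for manual debugging only; it does not validate
    the rest of the header.
    """
    with open(path, "rb") as f:
        return f.read(len(QCOW2_MAGIC)) == QCOW2_MAGIC
```

A file that returns False here while being passed to qemu-img with -f qcow2 would be expected to fail with an error like "Image is not in qcow2 format".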
(In reply to Kevin Wolf from comment #7)
Our question is different.

We try to convert an image using qemu-img convert. We specify that the image is qcow2, although qemu-img info shows that it is actually raw.

qemu-img-rhev-0.12.1.2-2.448.el6_6.x86_64 is willing to convert the image.
qemu-img-rhev-2.3.0-31.el7_2.1.x86_64 fails with a (correct) error of "Image is not in qcow2 format".

We wonder how come the previous version succeeded and the newer one fails. Did you add a missing validation in qemu-img to check that?
(In reply to Arik from comment #8)
> (In reply to Kevin Wolf from comment #7)
> Our question is different.
>
> We try to convert an image using qemu-img convert. We specify that the image
> is qcow2 although qemu-img info shows that it is actually raw.
>
> qemu-img-rhev-0.12.1.2-2.448.el6_6.x86_64 is willing to convert the image.
> qemu-img-rhev-2.3.0-31.el7_2.1.x86_64 fails with (a correct) error of "Image
> is not in qcow2 format".
>
> We wonder how comes that the previous version succeeded and the newer fails.
> Did you add a missing validation in qemu-img to check that?

It turns out that on older versions (3.5) we simply used 'dd' to copy qcow2 images, hence there was no error (even though the image had mistakenly been identified as cow). On new versions we use 'qemu-img convert' for both raw and cow.

So, is it fine to simply remove the source format parameter ('-f qcow2'), i.e. so that qemu-img identifies the source format automatically?
Hi Kevin,

I've tried to run qemu-img convert on the VM metadata file, which is a 10kb raw image, and the process just hangs. I.e.

Input:
/usr/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/qemu-img convert -p -t none -T none /rhev/data-center/47cb1d79-872b-4c78-bd62-8179069b85c2/7956f54a-1a71-4a5e-8229-0edef6b175e0/images/f00f914c-3d78-48b6-bb91-6735cfcf5eb1/166a5d45-b022-47d5-a811-02f2cb3dffcf -O raw /rhev/data-center/mnt/derez1.usersys:_home_data_export2/82b34343-5b4c-47a9-b2ca-3d54f9cca2c3/images/f00f914c-3d78-48b6-bb91-6735cfcf5eb1/166a5d45-b022-47d5-a811-02f2cb3dffcf

Output:
(0.00/100%)

* strace on the process shows an infinite loop of:
lseek(7, 5120, SEEK_DATA) = 5120
lseek(7, 5120, SEEK_HOLE) = 5474
lseek(7, 5120, SEEK_DATA) = 5120
lseek(7, 5120, SEEK_HOLE) = 5474
lseek(7, 5120, SEEK_DATA) = 5120
lseek(7, 5120, SEEK_HOLE) = 5474
lseek(7, 5120, SEEK_DATA) = 5120
lseek(7, 5120, SEEK_HOLE) = 5474
lseek(7, 5120, SEEK_DATA) = 5120
lseek(7, 5120, SEEK_HOLE) = 5474
lseek(7, 5120, SEEK_DATA) = 5120

BTW, I guessed it's related to [1], but a 101kb file leads to the same result...

What do you think?

[1] https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1410288
Created attachment 1095737 [details] convert stuck strace
Created attachment 1095738 [details] convert works strace
After some further testing: convert works only when executed directly from the source location. I.e.

/usr/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/qemu-img convert -p -t none -T none 166a5d45-b022-47d5-a811-02f2cb3dffcf -O raw /rhev/data-center/mnt/derez1.usersys:_home_data_export2/82b34343-5b4c-47a9-b2ca-3d54f9cca2c3/images/f00f914c-3d78-48b6-bb91-6735cfcf5eb1/166a5d45-b022-47d5-a811-02f2cb3dffcf
(100.00/100%)

* strace logs of both scenarios are attached.
(In reply to Daniel Erez from comment #9)
> Turns out that on older versions (3.5), we simply used 'dd' to copy qcow2
> images, hence, there was no error (since the image was mistakenly been
> identified as cow...). Now, on new versions, we use 'qemu-img convert' both
> for raw and cow. So, is it fine to simply remove the source format parameter
> ('-f qcow2')? I.e. so qemu-img could identify the source format
> automatically.

No, you always need to specify -f. For the occasional manual use case where you know the image, omitting it and relying on probing is fine, but you must never do that in management software. A raw image could start with a qcow2 header (after all, the guest can write anything it wants to it) and you still want it to be treated as raw.

(In reply to Daniel Erez from comment #10)
> BTW, I guessed it's related to [1], but a 101kb lead to the same result...
>
> What do you think?
>
> [1] https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1410288

The Ubuntu bug says that it's fixed in qemu 2.2, so it's probably different. I tried to reproduce and indeed this hangs:

$ dd if=/dev/zero of=/tmp/test.raw bs=5474 count=1
$ ./qemu-img convert -p -t none -T none -O raw /tmp/test.raw /tmp/dest.raw

The reason seems to be that the source file size isn't aligned to a sector boundary like a valid raw image would be. In upstream it does work (with the destination image size rounded up to the next full sector), but the lesson to learn is that you should only use qemu-img convert with disk images, never with random other files.

We will automatically get the upstream fix in qemu-kvm-rhev 7.3 as we rebase, but if you need it in 7.2.z, please clone the Fedora bug (bug 1229394). I'll already clone the bug for plain RHEL qemu-kvm because the bug exists there as well and I can't rely on a rebase there.
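A file that would trigger the hang described here can be spotted up front by checking whether its size is a multiple of the sector size. The following is a minimal sketch, assuming 512-byte sectors; the names SECTOR_SIZE, padding_needed and is_sector_aligned are hypothetical and not part of vdsm or qemu.

```python
import os

# Assumption: 512-byte sectors, matching the sector boundary discussed above.
SECTOR_SIZE = 512


def padding_needed(size, sector_size=SECTOR_SIZE):
    """Return how many bytes must be added to round `size` up to a full sector."""
    return -size % sector_size


def is_sector_aligned(path, sector_size=SECTOR_SIZE):
    """Return True if the size of the file at `path` is a multiple of the sector size."""
    return os.path.getsize(path) % sector_size == 0
```

For the 5474-byte reproducer, padding_needed(5474) gives 158, i.e. the file would have to grow to 5632 bytes (11 sectors) to be aligned.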
(In reply to Kevin Wolf from comment #14)
> $ dd if=/dev/zero of=/tmp/test.raw bs=5474 count=1
> $ ./qemu-img convert -p -t none -T none -O raw /tmp/test.raw /tmp/dest.raw
>
> The reason seems to be that the source file size isn't aligned to a sector
> boundary like a valid raw image would be. In upstream it does work (with the
> destination image size rounded up to the next full sector), but the lesson to
> learn is that you should only use qemu-img convert with disk images, never
> with
> random other files.

This image was created using qemu-img create:

    qemu-img create -f qcow2 foo 10240

Then its contents were replaced by writing vdsm metadata:

    with open('foo', 'w') as f:
        f.write(data)

This call truncates the file and writes the new data. Does it change the alignment of the original image?

Should we use direct I/O instead when writing raw image data? For example (using pseudo code):

    data += padding  # make it a multiple of 512 bytes
    cat data | dd of=foo oflag=direct

I know that creating a qcow2 image when we want a raw image is lame; we are fixing this.
Do you actually pass this as a disk to a guest? If no, you shouldn't be using qemu-img at all, because it's a tool for disk images, not for random files. But if you must, just make sure that you don't change the file size, i.e. open the file in a mode that doesn't truncate ("r+" for Python's open(), I guess; conv=notrunc for dd) and make sure that the written data isn't larger than the image file already is.
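The non-truncating write suggested here could look roughly like this in Python. It is a sketch only: write_in_place is a hypothetical helper, not existing vdsm code. It opens the file in "r+b" mode so the size is left alone, and it refuses data that is larger than the file already is.

```python
import os


def write_in_place(path, data):
    """Overwrite the start of an existing file without changing its size.

    Hypothetical helper: `data` must be bytes and must not be larger
    than the file, so the file size stays exactly as it was.
    """
    size = os.path.getsize(path)
    if len(data) > size:
        raise ValueError("data (%d bytes) is larger than the file (%d bytes)"
                         % (len(data), size))
    # "r+b" opens for reading and writing without truncating the file.
    with open(path, "r+b") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return size
```

Note that any bytes beyond len(data) keep their old contents, so the caller may want to pad the data (for example with zero bytes) up to the original size if stale contents would be a problem.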
(In reply to Kevin Wolf from comment #16) > Do you actually pass this as a disk to a guest? No, this is a metadata file that the guest will never see.
Thanks Kevin! I've cloned the bug to qemu-kvm-rhev: https://bugzilla.redhat.com/show_bug.cgi?id=1283278
Verified both export and import VM with RAM snapshot - both work. vdsm-4.17.15-0.el7ev.noarch rhevm-3.6.2-0.1.el6.noarch qemu-img-rhev-2.3.0-31.el7_2.5.x86_64 qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64
RHEV 3.6.0 has been released, setting status to CLOSED CURRENTRELEASE