Bug 1984852 - [CBT] Use --skip-broken-bitmaps in qemu-img convert --bitmaps to avoid failure if a bitmap is inconsistent
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.60.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.4.9
Target Release: ---
Assignee: Eyal Shenitzky
QA Contact: Amit Sharir
URL:
Whiteboard:
Depends On: 1946084
Blocks:
 
Reported: 2021-07-22 10:58 UTC by Nir Soffer
Modified: 2021-12-23 16:34 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Broken bitmaps prevent moving or copying VM disks. Consequence: A VM with broken bitmaps, left by a failed backup, cannot be moved or copied. Fix: Added support for moving or copying VM disks even if they contain broken bitmaps. Result: A VM whose disks contain broken bitmaps due to a failed backup no longer fails to move or copy.
Clone Of:
Environment:
Last Closed: 2021-10-21 07:27:14 UTC
oVirt Team: Storage
Embargoed:
pm-rhel: ovirt-4.4+
asharir: testing_plan_complete+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 116237 0 master MERGED qemuimg: use '--skip-broken-bitmap' in qemu-img convert 2021-09-30 18:04:35 UTC
oVirt gerrit 116885 0 master MERGED spec: update qemu-kvm requirement 2021-09-30 18:04:33 UTC
oVirt gerrit 116916 0 ovirt-4.4.z MERGED qemuio.py: Introduce abort() helper to crash QEMU process 2021-10-04 07:13:30 UTC
oVirt gerrit 116917 0 ovirt-4.4.z MERGED spec: update qemu-kvm requirement 2021-10-04 07:13:32 UTC
oVirt gerrit 116918 0 ovirt-4.4.z MERGED qemuimg: use '--skip-broken-bitmap' in qemu-img convert 2021-10-04 07:13:37 UTC

Description Nir Soffer 2021-07-22 10:58:40 UTC
Description of problem:

If a bitmap becomes inconsistent after abnormal vm termination, copying the 
volume with the inconsistent bitmap using the --bitmaps option will fail with:

    qemu-img: Failed to populate bitmap 5f59b2d6-6b52-484c-ae7a-f8b43f2175a4:
    Bitmap '5f59b2d6-6b52-484c-ae7a-f8b43f2175a4' is inconsistent and cannot
    be used
    Try block-dirty-bitmap-remove to delete this bitmap from disk

This fails the storage job, which in turn fails the move disk operation (both cold
and live) and possibly other operations (bug 1946084).

qemu-img added a --skip-broken-bitmaps option to skip inconsistent bitmaps and
avoid such failures. This feature will be available in qemu 6.1.0, and we hope
to also get it in a future RHEL AV 8.4.z update.

We can add the code for this now by detecting whether qemu-img convert supports
the option, and always using --skip-broken-bitmaps whenever the --bitmaps option is used.
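
For illustration, a minimal sketch of this approach in Python (hypothetical helper
names, not vdsm's actual code; it assumes that probing the qemu-img help text is a
good-enough way to detect support for the option):

    import subprocess
    from functools import lru_cache

    @lru_cache(maxsize=1)
    def supports_skip_broken_bitmaps():
        # Crude capability probe: look for the option name in the help text.
        # Real code could instead gate on the installed qemu-img version.
        res = subprocess.run(
            ["qemu-img", "--help"],
            capture_output=True, text=True, check=False)
        return "--skip-broken-bitmaps" in (res.stdout + res.stderr)

    def convert_cmd(src, dst, copy_bitmaps=False):
        # Build a qemu-img convert command line, adding --skip-broken-bitmaps
        # only when copying bitmaps and only if the option is supported.
        cmd = ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2"]
        if copy_bitmaps:
            cmd.append("--bitmaps")
            if supports_skip_broken_bitmaps():
                cmd.append("--skip-broken-bitmaps")
        cmd.extend([src, dst])
        return cmd

Caching the probe result avoids re-running qemu-img --help on every convert call.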

Once we have a fix in qemu-img, we can require the fixed version in the spec, but
this may happen only in 4.4.9.

Reproducing the issue:

1. Start a VM with a qcow2 disk
2. Create a full backup
3. Kill the qemu process (kill -9 <qemu-pid>)
4. Try to move the disk to another storage domain (cold and live)

Actual result:
Move disk fails with the mentioned error in qemu-img convert.

Testing the fix:
With the fixed version, moving the disk will succeed.

Testing requires a qemu-img version supporting --skip-broken-bitmaps. This version
will be available soon in RHEL AV 8.5 nightly builds and in CentOS Stream.

This failure is effectively a regression introduced in the version where we started
copying bitmaps with --bitmaps (I think 4.4.6). Before that we did not copy bitmaps,
so a full backup was required after moving a disk. In the current release, moving the
disk is not possible without removing the inconsistent bitmap. Deleting the related
checkpoint should remove the bitmap and fix this issue.
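
For reference, a hedged sketch of removing a VM's checkpoints with the oVirt Python
SDK (the connection details are placeholders, and the exact checkpoint service and
method names should be verified against the ovirt-engine-sdk4 version in use):

    import ovirtsdk4 as sdk

    # Hedged sketch: remove all checkpoints of a VM so that their bitmaps are
    # dropped from the disks. URL and credentials are placeholders.
    connection = sdk.Connection(
        url="https://engine.example.com/ovirt-engine/api",
        username="admin@internal",
        password="password",
        ca_file="ca.pem",
    )
    vm_service = connection.system_service().vms_service().vm_service("<vm-id>")
    checkpoints_service = vm_service.checkpoints_service()
    for checkpoint in checkpoints_service.list():
        # Removing a checkpoint deletes its bitmap from the qcow2 volume.
        checkpoints_service.checkpoint_service(checkpoint.id).remove()
    connection.close()

Removing the checkpoints should drop the corresponding bitmaps from the disk, which
works around the move failure on releases without --skip-broken-bitmaps.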

Comment 1 Amit Sharir 2021-10-11 12:43:30 UTC
Hi Nir, 

Have a few questions regarding the verification flow. 

Do I need to do steps 2 and 3 simultaneously? 

For step 3 I did the command: "ps ax | grep qemu" - do I simply need to kill the first process that appears there? 

Thanks.

Comment 2 Eyal Shenitzky 2021-10-11 13:07:21 UTC
(In reply to Amit Sharir from comment #1)
> Hi Nir, 
> 
> Have a few questions regarding the verification flow. 
> 
> Do I need to do steps 2 and 3 simultaneously? 
> 
> For step 3 I did the command: "ps ax | grep qemu" - do I simply need to kill
> the first process that appears there? 
> 
> Thanks.

No. You can just start a live backup and, after it has started, power off the VM from outside the guest (e.g. using the REST API with the 'force' option). That is enough to corrupt the bitmap.

Comment 3 Nir Soffer 2021-10-11 13:09:04 UTC
(In reply to Amit Sharir from comment #1)
> Hi Nir, 
> 
> Have a few questions regarding the verification flow. 
> 
> Do I need to do steps 2 and 3 simultaneously? 

No, you should wait until the full backup is completed.

> For step 3 I did the command: "ps ax | grep qemu" - do I simply need to kill
> the first process that appears there? 

No. There can be other qemu instances on the host, so you need to kill the right one.

You can find the right qemu instance by grepping for the VM name, which should be
unique.
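
For example, a small Python sketch that finds the qemu process of a specific VM by
the "guest=<name>" argument libvirt puts on its command line and kills it with
SIGKILL (hypothetical helper, shown only to illustrate this reproduction step):

    import os
    import signal
    import subprocess

    def kill_vm_qemu(vm_name):
        # Match the "guest=<vm name>" argument on the qemu command line,
        # then send SIGKILL, which qemu cannot catch or handle.
        res = subprocess.run(
            ["pgrep", "-f", f"guest={vm_name}"],
            capture_output=True, text=True, check=False)
        for pid in res.stdout.split():
            os.kill(int(pid), signal.SIGKILL)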

Comment 4 Eyal Shenitzky 2021-10-11 13:09:41 UTC
To make sure that the bitmaps got corrupted, you can run:

qemu-img info <path-to-volume>

and check that the bitmaps have the 'in-use' flag.
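
If you prefer to check this from a script, a minimal sketch (assuming qemu-img's
JSON output exposes the qcow2 bitmaps under format-specific -> data -> bitmaps, as
recent qemu versions do):

    import json
    import subprocess

    def inconsistent_bitmaps(volume_path):
        # Return the names of bitmaps carrying the 'in-use' flag, which marks
        # a bitmap that was not written out cleanly and is now inconsistent.
        res = subprocess.run(
            ["qemu-img", "info", "--output", "json", volume_path],
            capture_output=True, text=True, check=True)
        info = json.loads(res.stdout)
        bitmaps = info.get("format-specific", {}).get("data", {}).get("bitmaps", [])
        return [b["name"] for b in bitmaps if "in-use" in b.get("flags", [])]

After killing qemu, this should return the UUID of the bitmap created by the backup.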

Comment 5 Nir Soffer 2021-10-11 13:12:54 UTC
(In reply to Eyal Shenitzky from comment #2)
> No, you can just start a live backup, after it was started, power-off the VM
> outside the guest (can use REST-API with 'force' option), It is enough to
> corrupt the bitmap.

Shutting down the guest via the API or virsh will *not* corrupt any bitmap; qemu
persists bitmaps to storage when it terminates normally.

The best way to corrupt the bitmap is to kill qemu with SIGKILL since it cannot
handle this signal.

Comment 8 Amit Sharir 2021-10-12 09:47:02 UTC
Version: 
vdsm-4.40.90.2-1.el8ev.x86_64
ovirt-engine-4.4.9.1-0.13.el8ev.noarch


Verification Flow: 
1. Create a VM with qcow2 disk.
2. Start VM.
3. Create a full backup using "python3 backup_vm.py -c engine full <vm-id>"
4. Find the relevant QEMU process using "ps ax | grep guest=<VM name of the machine that was created in step 1>" on the host the VM is running on.
5. Kill the QEMU process using "kill -9 <qemu-pid>"
6. Move the disk to another storage (cold and live) - done via UI.

Verification Conclusions:
The actual output matched the expected output.
The whole flow ran with no errors, and I was able to move the disk afterwards without any issues.


Bug verified.

Comment 9 Nir Soffer 2021-10-12 10:18:36 UTC
(In reply to Amit Sharir from comment #8)
> Verification Flow: 
...

Sounds correct, but did you verify the original issue with a vdsm version
that does not use --skip-broken-bitmaps?

Comment 10 Amit Sharir 2021-10-12 11:09:07 UTC
I just did the flow I mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1984852#c8 and saw that I could move the disk without any issues. 
Do I need to check additional functionality in order to verify this bug?

Comment 11 Nir Soffer 2021-10-12 11:17:28 UTC
(In reply to Amit Sharir from comment #10)
> I just did the flow I mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1984852#c8 and saw that I could
> move the disk without any issues. 
> Do I need to check additional functionality in order to verify this bug?

If you don't reproduce the issue before testing, how do you know if your test
was correct?

You should be able to downgrade vdsm to an older version that did not support
--skip-broken-bitmaps (e.g. from 4.4.8), or run the same test on an older environment,
and reproduce the failure to move or copy the disk in the same flow.

Comment 12 Amit Sharir 2021-10-12 14:37:19 UTC
I was sure it was already reproduced in the same way. 



I just ran the flow mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1984852#c8 on version vdsm-4.40.80.2-1.el8ev.x86_64 / ovirt-engine-4.4.8.1-0.9.el8ev.noarch and was able to complete the whole flow with no errors, meaning that the verification flow mentioned is not valid. Do you have another way I can reproduce this? Maybe the way Eyal suggested?

Returning this to "on_qa" until this is clarified.

Comment 13 Nir Soffer 2021-10-13 00:16:07 UTC
(In reply to Amit Sharir from comment #12)
> Just ran the flow mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1984852#c8 on version
> vdsm-4.40.80.2-1.el8ev.x86_64 / ovirt-engine-4.4.8.1-0.9.el8ev.noarch and
> was able to complete the whole flow with no errors. Meaning that the
> verification flow mentioned was not valid.

The flow described in comment 0 was incorrect. This is why we need to
always reproduce the issue.

The bitmap created during the backup is written to the disk only during
shutdown, so killing the VM will not leave a broken bitmap on the disk;
the disk will not have any bitmap, so moving the disk does not fail.

We need to kill a VM whose disk already had a bitmap before the VM was started.

Try this; it should work:

1. Create VM with qcow2 disk on file storage
   (Testing on file storage is easier)
2. Start the VM
3. Perform full backup
4. Stop the VM normally (e.g. from the UI)
5. Check that the disk has the expected bitmaps

Find the disk snapshot UUID in the UI:
VMs -> snapshots -> Active VM -> disks

Find the volume file:

# ls /rhev/data-center/mnt/*/*/images/*/d0bc130c-c908-4258-871b-88ad16bfd072
/rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072

# qemu-img info /rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072
image: /rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 52.6 MiB
cluster_size: 65536
backing file: 9d63f782-7467-4243-af1e-5c1f8b49c111 (actual path: /rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/9d63f782-7467-4243-af1e-5c1f8b49c111)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    bitmaps:
        [0]:
            flags:
                [0]: auto
            name: b99cd4e4-b90f-4d78-95a5-04dec106634e
            granularity: 65536
    refcount bits: 16
    corrupt: false
    extended l2: false

6. Start the VM again
7. Find qemu pid
8. Kill qemu
9. Check the disk again

# qemu-img info /rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072
image: /rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 72.9 MiB
cluster_size: 65536
backing file: 9d63f782-7467-4243-af1e-5c1f8b49c111 (actual path: /rhev/data-center/mnt/alpine:_00/8ece2aae-5c72-4a5c-b23b-74bae65c88e1/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/9d63f782-7467-4243-af1e-5c1f8b49c111)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    bitmaps:
        [0]:
            flags:
                [0]: in-use
                [1]: auto
            name: b99cd4e4-b90f-4d78-95a5-04dec106634e
            granularity: 65536
    refcount bits: 16
    corrupt: false
    extended l2: false

We have a broken bitmap with the "in-use" flag.

10. Try to move the disk to another storage domain
    (another NFS domain in this example).

In 4.4.8 this will fail in vdsm with an error about broken bitmaps.

In 4.4.9 this will succeed.

11. Check the moved disk

Find the disk again:

# ls /rhev/data-center/mnt/*/*/images/*/d0bc130c-c908-4258-871b-88ad16bfd072
/rhev/data-center/mnt/alpine:_01/f07583a1-03d5-4716-9fb0-7dc5c347371a/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072

# qemu-img info /rhev/data-center/mnt/alpine:_01/f07583a1-03d5-4716-9fb0-7dc5c347371a/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072
image: /rhev/data-center/mnt/alpine:_01/f07583a1-03d5-4716-9fb0-7dc5c347371a/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/d0bc130c-c908-4258-871b-88ad16bfd072
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 79.6 MiB
cluster_size: 65536
backing file: 9d63f782-7467-4243-af1e-5c1f8b49c111 (actual path: /rhev/data-center/mnt/alpine:_01/f07583a1-03d5-4716-9fb0-7dc5c347371a/images/82fa5b89-bcfa-4de0-bc8b-834e65d97122/9d63f782-7467-4243-af1e-5c1f8b49c111)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false


The broken bitmap was skipped.

For a live disk move, you need to perform the entire flow again to create
a new broken bitmap, and start the VM before you move the disk.

Comment 14 Amit Sharir 2021-10-14 09:41:25 UTC
(In reply to Nir Soffer from comment #13)
> Try this; it should work:
...

I had 100% success reproducing the issue using the flow above, on version ovirt-engine-4.4.8.6-0.1.el8ev.noarch / vdsm-4.40.80.6-1.el8ev.x86_64.

Comment 15 Amit Sharir 2021-10-14 09:42:20 UTC
Version: 
vdsm-4.40.90.2-1.el8ev.x86_64
ovirt-engine-4.4.9.1-0.13.el8ev.noarch


Verification Flow:
As described by Nir in https://bugzilla.redhat.com/show_bug.cgi?id=1984852#c13 - I performed both flows, cold and live disk move (including bitmap validations along the way).


Verification Conclusions:
The actual output matched the expected output.
The whole flow ran with no errors, and I was able to move the disk afterwards without any issues.


Bug verified.

Comment 16 Sandro Bonazzola 2021-10-21 07:27:14 UTC
This bugzilla is included in oVirt 4.4.9 release, published on October 20th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.9 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

