Bug 2072379 - Fail to rebuild the reference count tables of qcow2 image on host block devices (e.g. LVs)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Hanna Czenczek
QA Contact: Tingting Mao
URL:
Whiteboard:
Depends On: 1519071
Blocks: 2072242
Reported: 2022-04-06 07:54 UTC by Tingting Mao
Modified: 2022-11-15 10:19 UTC
CC List: 21 users

Fixed In Version: qemu-kvm-7.0.0-6.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1519071
Environment:
Last Closed: 2022-11-15 09:54:42 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/centos-stream/src qemu-kvm merge_requests 96 | 0 | None | opened | qcow2: Improve refcount structure rebuilding | 2022-05-30 12:02:45 UTC
Red Hat Issue Tracker | RHELPLAN-118004 | 0 | None | None | None | 2022-04-06 08:09:00 UTC
Red Hat Product Errata | RHSA-2022:7967 | 0 | None | None | None | 2022-11-15 09:55:24 UTC

Description Tingting Mao 2022-04-06 07:54:09 UTC
+++ This bug was initially created as a clone of Bug #1519071 +++

Description of problem:
Create a qcow2 image on an LV with lazy_refcounts=on and install a guest with cache=writethrough. After the installation finishes, write a file inside the guest and kill the qemu process. Checking the image afterwards with "qemu-img check -r all" reports many errors.

Version-Release number of selected component (if applicable):
qemu-kvm-6.2.0-12.el9
kernel-5.14.0-75.el9.x86_64


How reproducible: 
100%


Steps to Reproduce:
1. Prepare an LV as below
# qemu-img create -f raw loop.img 50G
# losetup /dev/loop1 /home/timao/test/loop.img
# pvcreate /dev/loop1
# vgcreate vgroup /dev/loop1
# lvcreate -L 30G -n lv vgroup

2. Convert a known-good, installed qcow2 image to the LV above
# qemu-img check -r all RHEL-8.6-x86_64-latest.qcow2 
No errors were found on the image.
31054/163840 = 18.95% allocated, 92.09% fragmented, 90.65% compressed clusters
Image end offset: 966262784

# qemu-img convert -f qcow2 -O qcow2 -o lazy_refcounts=on,compat=1.1 RHEL-8.6-x86_64-latest.qcow2 /dev/vgroup/lv -p

# qemu-img info /dev/vgroup/lv 
image: /dev/vgroup/lv
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: true
    refcount bits: 16
    corrupt: false
    extended l2: false
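
The numbers in the two outputs above are related by simple arithmetic: dividing the virtual size by cluster_size gives the total cluster count that "qemu-img check" uses as the denominator of its "allocated" figure. A minimal sketch, using only values printed above; the function names are made up for illustration and are not part of qemu-img:

```shell
#!/bin/sh
# Hypothetical helpers (not part of qemu-img). Inputs are the values
# printed by "qemu-img info" and "qemu-img check" in this step.

# Total guest clusters = virtual size in bytes / cluster size in bytes.
total_clusters() {
    echo $(( $1 / $2 ))
}

# Percentage of allocated clusters, rounded to two decimals.
allocated_percent() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.2f\n", used * 100 / total }'
}

total_clusters 10737418240 65536    # virtual size, cluster_size
allocated_percent 31054 163840      # allocated clusters, total clusters
```

This should print 163840 and 18.95, matching the "31054/163840 = 18.95% allocated" line reported for the source image.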

3. Boot up a guest from the lv
# /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 15360  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'Haswell-noTSX',+kvm_pv_unhalt \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:1c:0c:0d:e3:4c,id=idjmZXQS,netdev=idEFQ4i1,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idEFQ4i1,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio \
    -device pcie-root-port,id=pcie-root-port-5,port=0x6,addr=0x1.0x5,bus=pcie.0,chassis=5 \
    -device virtio-scsi-pci,id=virtio_scsi_pci2,bus=pcie-root-port-5,addr=0x0 \
    -blockdev node-name=file_image1,driver=host_device,auto-read-only=on,discard=unmap,aio=threads,filename=/dev/vgroup/lv,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=off,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=off \
    -chardev socket,server=on,path=/var/tmp/monitor-qmpmonitor1-20210721-024113-AsZ7KYro,id=qmp_id_qmpmonitor1,wait=off  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control

4. Inside the guest, write a file with dd, get its md5 checksum, and sync
(guest)# dd if=/dev/urandom of=file1 conv=fsync bs=1M count=512 ; md5sum file1 ; sync

5. Kill the qemu-kvm process on the host immediately after the dd in the previous step finishes.
# kill -9 `pidof qemu-kvm`

6. Check the image on the LV
# qemu-img check -r all /dev/vgroup/lv 
ERROR cluster 33067 refcount=0 reference=1
ERROR cluster 33068 refcount=0 reference=1
ERROR cluster 33069 refcount=0 reference=1
ERROR cluster 33070 refcount=0 reference=1
ERROR cluster 33071 refcount=0 reference=1
ERROR cluster 33072 refcount=0 reference=1
......
......
ERROR cluster 40299 refcount=0 reference=1
ERROR cluster 40300 refcount=0 reference=1
Rebuilding refcount structure
ERROR writing refblock: No space left on device
qemu-img: Check failed: No space left on device
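
To see which part of the refcount metadata the reported clusters belong to, the cluster numbers can be mapped to refcount blocks. In qcow2 a refcount block occupies one cluster, so it holds cluster_size * 8 / refcount_bits entries. The sketch below is plain arithmetic on the cluster_size and "refcount bits" values from the "qemu-img info" output in step 2; the function names are illustrative only:

```shell
#!/bin/sh
# Hypothetical helpers for qcow2 refcount arithmetic (not a qemu tool).

# Entries per refcount block = cluster size in bits / bits per refcount.
refcounts_per_block() {
    echo $(( $1 * 8 / $2 ))
}

# Index of the refcount block that covers a given host cluster number.
refblock_index() {
    echo $(( $1 / $2 ))
}

per_block=$(refcounts_per_block 65536 16)   # cluster_size, refcount bits
echo "entries per refcount block: $per_block"
echo "cluster 33067 -> refcount block $(refblock_index 33067 "$per_block")"
echo "cluster 40300 -> refcount block $(refblock_index 40300 "$per_block")"
```

With these parameters each refcount block covers 32768 clusters, so the clusters listed above (33067 to 40300) all fall into the second block (index 1). The image written by "qemu-img convert" ended at offset 966262784, that is cluster 14744, which lies within the first block (index 0).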

Comment 5 Yanan Fu 2022-06-13 09:54:54 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 8 Tingting Mao 2022-06-16 09:50:51 UTC
Verified this bug as below.


Tested with:
qemu-kvm-7.0.0-6.el9
kernel-5.14.0-96.el9.x86_64


Steps:
1. Prepare an LV as below
# qemu-img create -f raw loop.img 50G
# losetup /dev/loop1 /home/timao/test/loop.img
# pvcreate /dev/loop1
# vgcreate vgroup /dev/loop1
# lvcreate -L 30G -n lv vgroup

2. Convert a known-good, installed qcow2 image to the LV above
# qemu-img check -r all RHEL-8.6-x86_64-latest.qcow2 
No errors were found on the image.
30848/163840 = 18.83% allocated, 91.29% fragmented, 89.69% compressed clusters
Image end offset: 985595904

# qemu-img convert -f qcow2 -O qcow2 -o lazy_refcounts=on,compat=1.1 RHEL-8.6-x86_64-latest.qcow2 /dev/vgroup/lv -p

# qemu-img info /dev/vgroup/lv 
image: /dev/vgroup/lv
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: true
    refcount bits: 16
    corrupt: false
    extended l2: false

3. Boot up a guest from the lv
# /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 15360  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'Haswell-noTSX',+kvm_pv_unhalt \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:1c:0c:0d:e3:4c,id=idjmZXQS,netdev=idEFQ4i1,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idEFQ4i1,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio \
    -device pcie-root-port,id=pcie-root-port-5,port=0x6,addr=0x1.0x5,bus=pcie.0,chassis=5 \
    -device virtio-scsi-pci,id=virtio_scsi_pci2,bus=pcie-root-port-5,addr=0x0 \
    -blockdev node-name=file_image1,driver=host_device,auto-read-only=on,discard=unmap,aio=threads,filename=/dev/vgroup/lv,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=off,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=off \
    -chardev socket,server=on,path=/var/tmp/monitor-qmpmonitor1-20210721-024113-AsZ7KYro,id=qmp_id_qmpmonitor1,wait=off  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control

4. Inside the guest, write a file with dd, get its md5 checksum, and sync
(guest)# dd if=/dev/urandom of=file1 conv=fsync bs=1M count=512 ; md5sum file1 ; sync

5. Kill the qemu-kvm process on the host immediately after the dd in the previous step finishes.
# kill -9 `pidof qemu-kvm`

6. Check the image on the LV
# qemu-img check -r all /dev/vgroup/lv 
......
......
ERROR cluster 48276 refcount=0 reference=1
ERROR cluster 48277 refcount=0 reference=1
Rebuilding refcount structure
Repairing cluster 1 refcount=1 reference=0
Repairing cluster 2 refcount=1 reference=0
Repairing cluster 32774 refcount=1 reference=0
The following inconsistencies were found and repaired:

    0 leaked clusters
    7039 corruptions

Double checking the fixed image now...
No errors were found on the image.
48260/163840 = 29.46% allocated, 2.25% fragmented, 0.00% compressed clusters
Image end offset: 3164143616
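
When the check output is long, it can help to reduce it to a few counters. The following is a rough sketch that reads saved "qemu-img check" output on stdin and counts only the line patterns visible in the transcript above; the function name and the one-line summary format are assumptions made for illustration:

```shell
#!/bin/sh
# Hypothetical summarizer for saved "qemu-img check -r all" output.
# It matches only the line shapes that appear in this bug report.
summarize_check() {
    awk '
        /^ERROR cluster /       { errors++ }       # per-cluster error lines
        /^Repairing cluster /   { repairs++ }      # per-cluster repair lines
        / leaked clusters/      { leaked = $1 }    # repaired-summary counter
        / corruptions/          { corrupt = $1 }   # repaired-summary counter
        /^No errors were found/ { clean = 1 }      # result of the re-check
        END {
            printf "error_lines=%d repair_lines=%d leaked=%d corruptions=%d clean=%d\n",
                   errors, repairs, leaked, corrupt, clean
        }'
}

# Example with a short excerpt of the output shown above.
summarize_check <<'EOF'
ERROR cluster 48276 refcount=0 reference=1
ERROR cluster 48277 refcount=0 reference=1
Rebuilding refcount structure
Repairing cluster 1 refcount=1 reference=0
The following inconsistencies were found and repaired:

    0 leaked clusters
    7039 corruptions

Double checking the fixed image now...
No errors were found on the image.
EOF
```

For this excerpt the summary reads error_lines=2 repair_lines=1 leaked=0 corruptions=7039 clean=1. The error_lines value is low only because the excerpt is elided; the full output would list every affected cluster.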


Results:
As above, 'check -r all' fixed the image.
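
For scripted verification, the exit status of "qemu-img check" can be used instead of reading the output. The sketch below maps the status values documented in the qemu-img man page to short descriptions; the function name is made up, and the commented-out usage line assumes the LV from the steps above:

```shell
#!/bin/sh
# Hypothetical helper: describe a "qemu-img check" exit status.
# The status values follow the qemu-img man page.
describe_check_status() {
    case "$1" in
        0)  echo "check completed, image is consistent" ;;
        1)  echo "check not completed because of internal errors" ;;
        2)  echo "check completed, image is corrupted" ;;
        3)  echo "check completed, image has leaked clusters but is not corrupted" ;;
        63) echo "checks are not supported by the image format" ;;
        *)  echo "unknown status $1" ;;
    esac
}

# Usage on the host (needs the image from the steps above):
# qemu-img check -r all /dev/vgroup/lv; describe_check_status $?
describe_check_status 2
```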

Comment 10 errata-xmlrpc 2022-11-15 09:54:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7967

