Bug 1775462

Summary:	Creating luks-inside-qcow2 images with cluster_size=2k/4k will get a corrupted image
Product:	Red Hat Enterprise Linux Advanced Virtualization	Reporter:	Tingting Mao <timao>
Component:	qemu-kvm	Assignee:	Daniel Berrangé <berrange>
qemu-kvm sub component:	General	QA Contact:	Xueqiang Wei <xuwei>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	areis, berrange, coli, ehadley, jinzhao, juzhang, qzhang, virt-maint, zhenyzha
Version:	8.2	Keywords:	Regression, Triaged
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-07-28 07:12:15 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Tingting Mao 2019-11-22 03:00:51 UTC

Description of problem:
As subject. For luks-inside-qcow2 images with remain cluster_size, like 512, 1k, ...2M , there is no the issue. And for plain qcow2 without encryption, there is no the issue. 


Version-Release number of selected component (if applicable):
qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb
kernel-4.18.0-148.el8.x86_64


How reproducible:
100%


Steps to Reproduce:
Scenario 1(With cluster_size=2k)
# qemu-img create --object secret,id=cluster_encrypt0,data=redhat -f qcow2 -o cluster_size=2k,encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 cluster_size_check.qcow2 1G

# qemu-img check --object secret,id=cluster_encrypt0,data=redhat 'json:{"driver": "qcow2", "encrypt.format": "luks", "encrypt.key-secret": "cluster_encrypt0", "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}'
ERROR: counting reference for region exceeding the end of the file by one cluster or more: offset 0x5800 size 0x1f9000
Leaked cluster 11 refcount=1 reference=0
Leaked cluster 12 refcount=1 reference=0
Leaked cluster 13 refcount=1 reference=0
Leaked cluster 14 refcount=1 reference=0
Leaked cluster 15 refcount=1 reference=0
Leaked cluster 16 refcount=1 reference=0
Leaked cluster 17 refcount=1 reference=0
Leaked cluster 18 refcount=1 reference=0
Leaked cluster 19 refcount=1 reference=0
Leaked cluster 20 refcount=1 reference=0
Leaked cluster 21 refcount=1 reference=0
Leaked cluster 22 refcount=1 reference=0
Leaked cluster 23 refcount=1 reference=0
Leaked cluster 24 refcount=1 reference=0
Leaked cluster 25 refcount=1 reference=0
Leaked cluster 26 refcount=1 reference=0
Leaked cluster 27 refcount=1 reference=0
Leaked cluster 28 refcount=1 reference=0
Leaked cluster 29 refcount=1 reference=0
Leaked cluster 30 refcount=1 reference=0
Leaked cluster 31 refcount=1 reference=0
Leaked cluster 32 refcount=1 reference=0
Leaked cluster 33 refcount=1 reference=0
Leaked cluster 34 refcount=1 reference=0
Leaked cluster 35 refcount=1 reference=0
Leaked cluster 36 refcount=1 reference=0
Leaked cluster 37 refcount=1 reference=0
Leaked cluster 38 refcount=1 reference=0
Leaked cluster 39 refcount=1 reference=0
Leaked cluster 40 refcount=1 reference=0
Leaked cluster 41 refcount=1 reference=0
Leaked cluster 42 refcount=1 reference=0
Leaked cluster 43 refcount=1 reference=0
Leaked cluster 44 refcount=1 reference=0
Leaked cluster 45 refcount=1 reference=0
Leaked cluster 46 refcount=1 reference=0
Leaked cluster 47 refcount=1 reference=0
Leaked cluster 48 refcount=1 reference=0
Leaked cluster 49 refcount=1 reference=0
Leaked cluster 50 refcount=1 reference=0
Leaked cluster 51 refcount=1 reference=0
Leaked cluster 52 refcount=1 reference=0
Leaked cluster 53 refcount=1 reference=0
Leaked cluster 54 refcount=1 reference=0
Leaked cluster 55 refcount=1 reference=0
Leaked cluster 56 refcount=1 reference=0
Leaked cluster 57 refcount=1 reference=0
Leaked cluster 58 refcount=1 reference=0
Leaked cluster 59 refcount=1 reference=0
Leaked cluster 60 refcount=1 reference=0
Leaked cluster 61 refcount=1 reference=0
Leaked cluster 62 refcount=1 reference=0
Leaked cluster 63 refcount=1 reference=0
Leaked cluster 64 refcount=1 reference=0
Leaked cluster 65 refcount=1 reference=0
Leaked cluster 66 refcount=1 reference=0
Leaked cluster 67 refcount=1 reference=0
Leaked cluster 68 refcount=1 reference=0
Leaked cluster 69 refcount=1 reference=0
Leaked cluster 70 refcount=1 reference=0
Leaked cluster 71 refcount=1 reference=0
Leaked cluster 72 refcount=1 reference=0
Leaked cluster 73 refcount=1 reference=0
Leaked cluster 74 refcount=1 reference=0
Leaked cluster 75 refcount=1 reference=0
Leaked cluster 76 refcount=1 reference=0
Leaked cluster 77 refcount=1 reference=0
Leaked cluster 78 refcount=1 reference=0
Leaked cluster 79 refcount=1 reference=0
Leaked cluster 80 refcount=1 reference=0
Leaked cluster 81 refcount=1 reference=0
Leaked cluster 82 refcount=1 reference=0
Leaked cluster 83 refcount=1 reference=0
Leaked cluster 84 refcount=1 reference=0
Leaked cluster 85 refcount=1 reference=0
Leaked cluster 86 refcount=1 reference=0
Leaked cluster 87 refcount=1 reference=0
Leaked cluster 88 refcount=1 reference=0
Leaked cluster 89 refcount=1 reference=0
Leaked cluster 90 refcount=1 reference=0
Leaked cluster 91 refcount=1 reference=0
Leaked cluster 92 refcount=1 reference=0
Leaked cluster 93 refcount=1 reference=0
Leaked cluster 94 refcount=1 reference=0
Leaked cluster 95 refcount=1 reference=0
Leaked cluster 96 refcount=1 reference=0
Leaked cluster 97 refcount=1 reference=0
Leaked cluster 98 refcount=1 reference=0
Leaked cluster 99 refcount=1 reference=0
Leaked cluster 100 refcount=1 reference=0
Leaked cluster 101 refcount=1 reference=0
Leaked cluster 102 refcount=1 reference=0
Leaked cluster 103 refcount=1 reference=0
Leaked cluster 104 refcount=1 reference=0
Leaked cluster 105 refcount=1 reference=0
Leaked cluster 106 refcount=1 reference=0
Leaked cluster 107 refcount=1 reference=0
Leaked cluster 108 refcount=1 reference=0
Leaked cluster 109 refcount=1 reference=0
Leaked cluster 110 refcount=1 reference=0
Leaked cluster 111 refcount=1 reference=0
Leaked cluster 112 refcount=1 reference=0
Leaked cluster 113 refcount=1 reference=0
Leaked cluster 114 refcount=1 reference=0
Leaked cluster 115 refcount=1 reference=0
Leaked cluster 116 refcount=1 reference=0
Leaked cluster 117 refcount=1 reference=0
Leaked cluster 118 refcount=1 reference=0
Leaked cluster 119 refcount=1 reference=0
Leaked cluster 120 refcount=1 reference=0
Leaked cluster 121 refcount=1 reference=0
Leaked cluster 122 refcount=1 reference=0
Leaked cluster 123 refcount=1 reference=0
Leaked cluster 124 refcount=1 reference=0
Leaked cluster 125 refcount=1 reference=0
Leaked cluster 126 refcount=1 reference=0
Leaked cluster 127 refcount=1 reference=0
Leaked cluster 128 refcount=1 reference=0
Leaked cluster 129 refcount=1 reference=0
Leaked cluster 130 refcount=1 reference=0
Leaked cluster 131 refcount=1 reference=0
Leaked cluster 132 refcount=1 reference=0
Leaked cluster 133 refcount=1 reference=0
Leaked cluster 134 refcount=1 reference=0
Leaked cluster 135 refcount=1 reference=0
Leaked cluster 136 refcount=1 reference=0
Leaked cluster 137 refcount=1 reference=0

1 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.

127 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
Image end offset: 282624


Scenario 2(With cluster_size=4K)
# qemu-img create --object secret,id=cluster_encrypt0,data=redhat -f qcow2 -o cluster_size=4k,encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 cluster_size_check.qcow2 1G

# qemu-img check --object secret,id=cluster_encrypt0,data=redhat 'json:{"driver": "qcow2", "encrypt.format": "luks", "encrypt.key-secret": "cluster_encrypt0", "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}'
ERROR: counting reference for region exceeding the end of the file by one cluster or more: offset 0x4000 size 0x1f9000
Leaked cluster 4 refcount=1 reference=0
Leaked cluster 5 refcount=1 reference=0
Leaked cluster 6 refcount=1 reference=0
Leaked cluster 7 refcount=1 reference=0
Leaked cluster 8 refcount=1 reference=0
Leaked cluster 9 refcount=1 reference=0
Leaked cluster 10 refcount=1 reference=0
Leaked cluster 11 refcount=1 reference=0
Leaked cluster 12 refcount=1 reference=0
Leaked cluster 13 refcount=1 reference=0
Leaked cluster 14 refcount=1 reference=0
Leaked cluster 15 refcount=1 reference=0
Leaked cluster 16 refcount=1 reference=0
Leaked cluster 17 refcount=1 reference=0
Leaked cluster 18 refcount=1 reference=0
Leaked cluster 19 refcount=1 reference=0
Leaked cluster 20 refcount=1 reference=0
Leaked cluster 21 refcount=1 reference=0
Leaked cluster 22 refcount=1 reference=0
Leaked cluster 23 refcount=1 reference=0
Leaked cluster 24 refcount=1 reference=0
Leaked cluster 25 refcount=1 reference=0
Leaked cluster 26 refcount=1 reference=0
Leaked cluster 27 refcount=1 reference=0
Leaked cluster 28 refcount=1 reference=0
Leaked cluster 29 refcount=1 reference=0
Leaked cluster 30 refcount=1 reference=0
Leaked cluster 31 refcount=1 reference=0
Leaked cluster 32 refcount=1 reference=0
Leaked cluster 33 refcount=1 reference=0
Leaked cluster 34 refcount=1 reference=0
Leaked cluster 35 refcount=1 reference=0
Leaked cluster 36 refcount=1 reference=0
Leaked cluster 37 refcount=1 reference=0
Leaked cluster 38 refcount=1 reference=0
Leaked cluster 39 refcount=1 reference=0
Leaked cluster 40 refcount=1 reference=0
Leaked cluster 41 refcount=1 reference=0
Leaked cluster 42 refcount=1 reference=0
Leaked cluster 43 refcount=1 reference=0
Leaked cluster 44 refcount=1 reference=0
Leaked cluster 45 refcount=1 reference=0
Leaked cluster 46 refcount=1 reference=0
Leaked cluster 47 refcount=1 reference=0
Leaked cluster 48 refcount=1 reference=0
Leaked cluster 49 refcount=1 reference=0
Leaked cluster 50 refcount=1 reference=0
Leaked cluster 51 refcount=1 reference=0
Leaked cluster 52 refcount=1 reference=0
Leaked cluster 53 refcount=1 reference=0
Leaked cluster 54 refcount=1 reference=0
Leaked cluster 55 refcount=1 reference=0
Leaked cluster 56 refcount=1 reference=0
Leaked cluster 57 refcount=1 reference=0
Leaked cluster 58 refcount=1 reference=0
Leaked cluster 59 refcount=1 reference=0
Leaked cluster 60 refcount=1 reference=0
Leaked cluster 61 refcount=1 reference=0
Leaked cluster 62 refcount=1 reference=0
Leaked cluster 63 refcount=1 reference=0
Leaked cluster 64 refcount=1 reference=0
Leaked cluster 65 refcount=1 reference=0
Leaked cluster 66 refcount=1 reference=0
Leaked cluster 67 refcount=1 reference=0

1 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.

64 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
Image end offset: 278528


Actual results:
As above, the image is corrupted.


Expected results:
Image is well. There is no error in the image.


Additional info:
1. Tried with all the supported cluster_size via below script. Only 2k and 4k hit the issue.

#!/bin/bash

for ((i=9; i<=21; i++))
do
    cluster=$(( 2 ** ${i} ))
    qemu-img create --object secret,id=cluster_encrypt0,data=redhat -f qcow2 -o cluster_size=${cluster},encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 cluster_size_check.qcow2 1G > /dev/null
    qemu-img check --object secret,id=cluster_encrypt0,data=redhat 'json:{"driver": "qcow2", "encrypt.format": "luks", "encrypt.key-secret": "cluster_encrypt0", "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}' > /dev/null 2>&1
    if [ $? != 0 ]; then
        echo "Check failed with cluster_size=${cluster}!"
    fi
done

# sh test.sh 
Check failed with cluster_size=2048!
Check failed with cluster_size=4096!

2. There is no the issue for plain qcow2 images without encryption.

#!/bin/bash

for ((i=9; i<=21; i++))
do
    cluster=$(( 2 ** ${i} ))
    qemu-img create -f qcow2 -o cluster_size=${cluster} cluster_size_check.qcow2 1G > /dev/null
    qemu-img check cluster_size_check.qcow2 > /dev/null 2>&1
    if [ $? != 0 ]; then
        echo "Check failed with cluster_size=${cluster}!"
    fi
done

# sh test.sh 
# echo $?
0

Comment 5 Ademar Reis 2019-11-25 15:34:06 UTC

*** Bug 1775480 has been marked as a duplicate of this bug. ***

Comment 7 Daniel Berrangé 2019-11-25 16:03:19 UTC

git bisect blames this commit upstream

a5fff8d4b4d928311a5005efa12d0991fe3b66f9 is the first bad commit
commit a5fff8d4b4d928311a5005efa12d0991fe3b66f9
Author: Vladimir Sementsov-Ogievskiy <vsementsov>
Date:   Wed Feb 27 16:14:30 2019 +0300

    qcow2-refcount: avoid eating RAM
    
    qcow2_inc_refcounts_imrt() (through realloc_refcount_array()) can eat
    an unpredictable amount of memory on corrupted table entries, which are
    referencing regions far beyond the end of file.
    
    Prevent this, by skipping such regions from further processing.
    
    Interesting that iotest 138 checks exactly the behavior which we fix
    here. So, change the test appropriately.
    
    Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov>
    Reviewed-by: Max Reitz <mreitz>
    Message-id: 20190227131433.197063-3-vsementsov
    Signed-off-by: Max Reitz <mreitz>

 block/qcow2-refcount.c     | 19 +++++++++++++++++++
 tests/qemu-iotests/138     | 12 +++++-------
 tests/qemu-iotests/138.out |  5 ++++-
 3 files changed, 28 insertions(+), 8 deletions(-)

Comment 8 Daniel Berrangé 2019-11-25 17:32:05 UTC

IIUC, what's happening here is that the  qcow2_crypto_hdr_init_func() is allocating a series of clusters big enough to hold the LUKS header, and key material that follows it.

We'll only ever initialize key material for the first slot, however, so many of the clusters will remain unwritten, presumed to return all zeroes on future reads but it doesn't really matter if not.

With a5fff8d4b4d928311a5005efa12d0991fe3b66f9 there was some optimization done which appears to believe that all allocated clusters will have had some data written to them. This assumption is invalid for LUKS headers and so it mistakenly complains about corrupt files.

As a workaround we can make the image creation function explicitly write zeros to all clusters allocated for the LUKS header & key material.

Comment 9 Ademar Reis 2019-12-02 17:01:12 UTC

*** Bug 1775480 has been marked as a duplicate of this bug. ***

Comment 11 Ademar Reis 2020-02-05 23:08:48 UTC

QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 15 Daniel Berrangé 2020-05-05 10:29:48 UTC

Merged in 5.0.0 release upstream as

commit 087ab8e775f48766068e65de1bc99d03b40d1670
Author: Daniel P. Berrangé <berrange>
Date:   Fri Feb 7 13:55:20 2020 +0000

    block: always fill entire LUKS header space with zeros
    
    When initializing the LUKS header the size with default encryption
    parameters will currently be 2068480 bytes. This is rounded up to
    a multiple of the cluster size, 2081792, with 64k sectors. If the
    end of the header is not the same as the end of the cluster we fill
    the extra space with zeros. This was forgetting that not even the
    space allocated for the header will be fully initialized, as we
    only write key material for the first key slot. The space left
    for the other 7 slots is never written to.
    
    An optimization to the ref count checking code:
    
      commit a5fff8d4b4d928311a5005efa12d0991fe3b66f9 (refs/bisect/bad)
      Author: Vladimir Sementsov-Ogievskiy <vsementsov>
      Date:   Wed Feb 27 16:14:30 2019 +0300
    
        qcow2-refcount: avoid eating RAM
    
    made the assumption that every cluster which was allocated would
    have at least some data written to it. This was violated by way
    the LUKS header is only partially written, with much space simply
    reserved for future use.
    
    Depending on the cluster size this problem was masked by the
    logic which wrote zeros between the end of the LUKS header and
    the end of the cluster.
    
    $ qemu-img create --object secret,id=cluster_encrypt0,data=123456 \
       -f qcow2 -o cluster_size=2k,encrypt.iter-time=1,\
                   encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 \
                   cluster_size_check.qcow2 100M
      Formatting 'cluster_size_check.qcow2', fmt=qcow2 size=104857600
        encrypt.format=luks encrypt.key-secret=cluster_encrypt0
        encrypt.iter-time=1 cluster_size=2048 lazy_refcounts=off refcount_bits=16
    
    $ qemu-img check --object secret,id=cluster_encrypt0,data=redhat \
        'json:{"driver": "qcow2", "encrypt.format": "luks", \
               "encrypt.key-secret": "cluster_encrypt0", \
               "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}'
    ERROR: counting reference for region exceeding the end of the file by one cluster or more: offset 0x2000 size 0x1f9000
    Leaked cluster 4 refcount=1 reference=0
    ...snip...
    Leaked cluster 130 refcount=1 reference=0
    
    1 errors were found on the image.
    Data may be corrupted, or further writes to the image may corrupt it.
    
    127 leaked clusters were found on the image.
    This means waste of disk space, but no harm to data.
    Image end offset: 268288
    
    The problem only exists when the disk image is entirely empty. Writing
    data to the disk image payload will solve the problem by causing the
    end of the file to be extended further.
    
    The change fixes it by ensuring that the entire allocated LUKS header
    region is fully initialized with zeros. The qemu-img check will still
    fail for any pre-existing disk images created prior to this change,
    unless at least 1 byte of the payload is written to.
    
    Fully writing zeros to the entire LUKS header is a good idea regardless
    as it ensures that space has been allocated on the host filesystem (or
    whatever block storage backend is used).
    
    Signed-off-by: Daniel P. Berrangé <berrange>
    Message-Id: <20200207135520.2669430-1-berrange>
    Reviewed-by: Eric Blake <eblake>
    Signed-off-by: Max Reitz <mreitz>

Comment 20 Xueqiang Wei 2020-05-26 12:35:46 UTC

Tested with qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2, not hit this issue. So set status to VERIFIED.

Details:

Version:
kernel-4.18.0-193.5.1.el8_2.x86_64
qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2
spice-server-0.14.2-1.el8.x86_64
seavgabios-bin-1.13.0-1.module+el8.2.0+5520+4e5817f3.noarch
seabios-1.13.0-1.module+el8.2.0+5520+4e5817f3.x86_64


Scenario 1(With cluster_size=2k)
# qemu-img create --object secret,id=cluster_encrypt0,data=redhat -f qcow2 -o cluster_size=2k,encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 cluster_size_check.qcow2 1G

# qemu-img check --object secret,id=cluster_encrypt0,data=redhat 'json:{"driver": "qcow2", "encrypt.format": "luks", "encrypt.key-secret": "cluster_encrypt0", "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}'

No errors were found on the image.
Image end offset: 2091008


Scenario 2(With cluster_size=4K)
# qemu-img create --object secret,id=cluster_encrypt0,data=redhat -f qcow2 -o cluster_size=4k,encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 cluster_size_check.qcow2 1G

# qemu-img check --object secret,id=cluster_encrypt0,data=redhat 'json:{"driver": "qcow2", "encrypt.format": "luks", "encrypt.key-secret": "cluster_encrypt0", "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}'

No errors were found on the image.
Image end offset: 2084864



Tried with all the supported cluster_size via below script. All work well.

#!/bin/bash

for ((i=9; i<=21; i++))
do
    cluster=$(( 2 ** ${i} ))
    qemu-img create --object secret,id=cluster_encrypt0,data=redhat -f qcow2 -o cluster_size=${cluster},encrypt.format=luks,encrypt.key-secret=cluster_encrypt0 cluster_size_check.qcow2 1G > /dev/null
    qemu-img check --object secret,id=cluster_encrypt0,data=redhat 'json:{"driver": "qcow2", "encrypt.format": "luks", "encrypt.key-secret": "cluster_encrypt0", "file.driver": "file", "file.filename": "cluster_size_check.qcow2"}' > /dev/null 2>&1
    if [ $? != 0 ]; then
        echo "Check failed with cluster_size=${cluster}!"
    fi
done

# sh test.sh 
# echo $?
0



Tried with plain qcow2 images without encryption. All work well.

#!/bin/bash

for ((i=9; i<=21; i++))
do
    cluster=$(( 2 ** ${i} ))
    qemu-img create -f qcow2 -o cluster_size=${cluster} cluster_size_check.qcow2 1G > /dev/null
    qemu-img check cluster_size_check.qcow2 > /dev/null 2>&1
    if [ $? != 0 ]; then
        echo "Check failed with cluster_size=${cluster}!"
    fi
done

# sh test.sh 
# echo $?
0

Comment 22 errata-xmlrpc 2020-07-28 07:12:15 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3172