Description of problem:
When converting a luks format image, the destination image is always fully allocated. qemu-img should be clever enough to detect unallocated space or zero sectors in the source image.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.10.0-17.el7

How reproducible:
100%

Steps to Reproduce:
1. Create a luks image
# qemu-img create -f luks --object secret,id=sec0,data=base -o key-secret=sec0 base.luks 10G
# qemu-img info base.luks
image: base.luks
file format: luks
virtual size: 10G (10737418240 bytes)
disk size: 256K
encrypted: yes
Format specific information:
    ivgen alg: plain64
    hash alg: sha256
    cipher alg: aes-256
    uuid: ba3558e2-521f-48f6-a947-f5b09b6e85da
    cipher mode: xts
    slots:
        [0]:
            active: true
            iters: 742196
            key offset: 4096
            stripes: 4000
        [1]:
            active: false
            key offset: 262144
        [2]:
            active: false
            key offset: 520192
        [3]:
            active: false
            key offset: 778240
        [4]:
            active: false
            key offset: 1036288
        [5]:
            active: false
            key offset: 1294336
        [6]:
            active: false
            key offset: 1552384
        [7]:
            active: false
            key offset: 1810432
    payload offset: 2068480
    master key iters: 163512

2. Convert the luks image to a luks image
# qemu-img convert -p --object secret,id=sec0,data=base --object secret,id=sec1,data=convert --image-opts driver=luks,key-secret=sec0,file.filename=base.luks -O luks -o key-secret=sec1 convert.luks
# qemu-img info convert.luks
image: convert.luks
file format: luks
virtual size: 10G (10737418240 bytes)
disk size: 10G
encrypted: yes
Format specific information:
    ivgen alg: plain64
    hash alg: sha256
    cipher alg: aes-256
    uuid: f003d683-a8d9-4d8d-bd26-f8d0fd33aed8
    cipher mode: xts
    slots:
        [0]:
            active: true
            iters: 734708
            key offset: 4096
            stripes: 4000
        [1]:
            active: false
            key offset: 262144
        [2]:
            active: false
            key offset: 520192
        [3]:
            active: false
            key offset: 778240
        [4]:
            active: false
            key offset: 1036288
        [5]:
            active: false
            key offset: 1294336
        [6]:
            active: false
            key offset: 1552384
        [7]:
            active: false
            key offset: 1810432
    payload offset: 2068480
    master key iters: 184331

3. Convert the luks image to a qcow2 image
# qemu-img convert -p --object secret,id=sec0,data=base --image-opts driver=luks,key-secret=sec0,file.filename=base.luks -O qcow2 convert.qcow2
# qemu-img info convert.qcow2
image: convert.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 10G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4. Convert the luks image to a raw image
# qemu-img convert -p --object secret,id=sec0,data=base --image-opts driver=luks,key-secret=sec0,file.filename=base.luks -O raw convert.img
# qemu-img info convert.img
image: convert.img
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: 10G

5. Convert the luks image to a qcow2 image encrypted by luks
# qemu-img convert -p --object secret,id=sec0,data=base --object secret,id=sec1,data=convert --image-opts driver=luks,key-secret=sec0,file.filename=base.luks -O qcow2 -o encrypt.format=luks,encrypt.key-secret=sec1 convert_encrypted_luks.qcow2
# qemu-img info convert_encrypted_luks.qcow2
image: convert_encrypted_luks.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 12G
encrypted: yes
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    encrypt:
        ivgen alg: plain64
        hash alg: sha256
        cipher alg: aes-256
        uuid: 8c973557-cb36-483b-a26c-ad41cf9d556c
        format: luks
        cipher mode: xts
        slots:
            [0]:
                active: true
                iters: 700170
                key offset: 4096
                stripes: 4000
            [1]:
                active: false
                key offset: 262144
            [2]:
                active: false
                key offset: 520192
            [3]:
                active: false
                key offset: 778240
            [4]:
                active: false
                key offset: 1036288
            [5]:
                active: false
                key offset: 1294336
            [6]:
                active: false
                key offset: 1552384
            [7]:
                active: false
                key offset: 1810432
        payload offset: 2068480
        master key iters: 177296
    corrupt: false
# ls -als convert_encrypted_luks.qcow2
12576348 -rw-r--r-- 1 root root 10741415936 Jan 18 03:01 convert_encrypted_luks.qcow2

Actual results:
The destination image is fully allocated

Expected results:
The destination image should not be fully allocated

Additional info:
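For anyone reproducing this without qemu-img at hand: the "disk size" value above reflects the blocks actually allocated on the host rather than the file length, and the first column of `ls -als` shows the same thing. A minimal Python sketch of that comparison follows. It assumes a POSIX host where st_blocks is counted in 512-byte units, and the helper names (allocation_report, is_sparse, make_sparse_file) are made up for illustration.

```python
import os
import tempfile


def allocation_report(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    Assumption: st_blocks is expressed in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def is_sparse(path):
    """True if fewer bytes are allocated than the file length suggests."""
    apparent, allocated = allocation_report(path)
    return allocated < apparent


def make_sparse_file(path, size):
    """Create a file of the given length without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.img")
        make_sparse_file(demo, 1 << 20)
        apparent, allocated = allocation_report(demo)
        print("apparent:", apparent, "allocated:", allocated,
              "sparse:", is_sparse(demo))
```

Running the same report on base.luks and on the converted images should show the gap described in this bug: the source has a small allocated size, while the targets are allocated for (almost) their whole length.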
The rebase operation shows the same bug.

Package tested:
qemu-kvm-rhev-2.12.0-3.el7

Steps:
1. Create luks base image
# qemu-img create -f luks --object secret,id=sec0,data=base -o key-secret=sec0 base.luks 10G

2. Create qcow2 snapshot based on luks image
# qemu-img create -f qcow2 -F luks --object secret,id=sec0,data=base -b 'json:{"driver": "luks", "file": {"driver": "file", "filename": "base.luks"}, "key-secret": "sec0"}' sn1.qcow2

3. Create a new qcow2 image
# qemu-img create -f qcow2 new.qcow2 10G

4. Rebase the snapshot to the new qcow2 image
# time qemu-img rebase --object secret,id=sec0,data=base -b new.qcow2 sn1.qcow2
real 4m2.070s
user 2m45.256s
sys 0m24.450s

5. Check the info of the snapshot
# qemu-img info sn1.qcow2
image: sn1.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 10G --------------> fully allocated
cluster_size: 65536
backing file: new.qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
Still hit the issue with the latest qemu packages.

Tested with:
qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf
kernel-4.18.0-112.el8

Steps:
1. Create source image with luks format
# qemu-img create -f luks --object secret,id=sec0,data=base -o key-secret=sec0 base.luks 2G
# qemu-img info base.luks
image: base.luks
file format: luks
virtual size: 2.0G (2147483648 bytes)
disk size: 256K
encrypted: yes
Format specific information:
    ivgen alg: plain64
    hash alg: sha256
    cipher alg: aes-256
    uuid: b0505316-7feb-442b-879c-a22f11c64684
    cipher mode: xts
    slots:
        [0]:
            active: true
            iters: 1147740
            key offset: 4096
            stripes: 4000
        [1]:
            active: false
            key offset: 262144
        [2]:
            active: false
            key offset: 520192
        [3]:
            active: false
            key offset: 778240
        [4]:
            active: false
            key offset: 1036288
        [5]:
            active: false
            key offset: 1294336
        [6]:
            active: false
            key offset: 1552384
        [7]:
            active: false
            key offset: 1810432
    payload offset: 2068480
    master key iters: 286864

2. Convert to qcow2 target file
# qemu-img convert --object secret,id=sec0,data=base 'json:{"driver": "luks", "file":{"driver": "file", "filename": "base.luks"}, "key-secret": "sec0"}' -O qcow2 tgt.qcow2 -p
(100.00/100%)
# qemu-img info tgt.qcow2
image: tgt.qcow2
file format: qcow2
virtual size: 2.0G (2147483648 bytes)
disk size: 2.0G ---------------------------- Fully allocated!
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3. Convert to raw target file
# qemu-img convert --object secret,id=sec0,data=base 'json:{"driver": "luks", "file":{"driver": "file", "filename": "base.luks"}, "key-secret": "sec0"}' -O raw tgt.img -p
(100.00/100%)
# qemu-img info tgt.img
image: tgt.img
file format: raw
virtual size: 2.0G (2147483648 bytes)
disk size: 2.0G ----------------------- Fully allocated!
I did an initial analysis of this issue:

First of all, the luks driver doesn't implement block status. This means that qemu-img always treats the luks source image as fully allocated. However, this is tricky to implement and might not even be worth it. There are several cases where we could do better:

1. The problem with encryption is that it pretty much guarantees that if the underlying storage/file is zero, the 'decrypted data' won't be; it will be some random garbage. Sparsely allocated files have holes which read as zero. Thus, if we want to convert a luks image to any unencrypted format like raw or plain qcow2, and we want both images to have the same checksum, we pretty much have to allocate the target image fully.

If we accept that unallocated areas in the luks image are really garbage and can be changed by the conversion, we can probe the underlying file for holes, and for each hole the luks driver can 'lie' that the area is zeroed, so that it will be zeroed (by punching holes) in the target image. This can and probably should be done, but only as an optional feature for image conversion, to avoid inconsistencies.

We can make it even somewhat consistent by adding a new flag to the luks driver (say dont_decrypt_zero), which would make it check the read data and, if it reads as all zeros, make an exception and not 'decrypt' it. Again, for strict compatibility with the original luks format that is wrong, and it will slow things down, but if this is an option only used when the luks image is opened for conversion, it might not be a bad idea at all.

2. In theory, if the target is encrypted as well, and with the same key, like encrypted qcow2 or luks, we could be smarter and detect that after encryption we got all zeros back, and punch a hole in the target file instead of writing these zeros. This can slow things down, so it should be implemented as an optional (maybe even generic?) feature.

While converting a luks image to a luks image is a somewhat pointless waste of resources (we can just copy the file in this case), I do see this as a useful (even very useful, I would say) option for converting a luks image to a qcow2 image encrypted with luks, using the same password. Of course, for this case we could be even smarter and pass the encrypted data through, and just embed the luks header into the qcow2 file. In fact, we could do this even without knowing the encryption password. I don't know if that fits the qemu-img purpose though.

Best regards,
Maxim Levitsky
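To make the "probe the underlying file for holes" and "check whether the data reads as all zeros" ideas concrete, here is a rough Python sketch. It is not how the luks driver works or would work; it only assumes a Linux host where lseek supports SEEK_DATA and SEEK_HOLE, and the function names (data_extents, is_all_zero) are invented for this example.

```python
import errno
import os


def data_extents(path):
    """Return a list of (offset, length) for regions that hold data.

    Everything outside the returned extents is a hole in the host file.
    On filesystems without hole support the whole file is reported as
    one data extent.
    """
    extents = []
    with open(path, "rb") as f:
        fd = f.fileno()
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                start = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # only a hole remains until end of file
                raise
            # There is always an implicit hole at end of file.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end - start))
            pos = end
    return extents


def is_all_zero(buf):
    """True if every byte in the buffer is zero (the check a
    hypothetical dont_decrypt_zero option would need per read)."""
    return not any(buf)
```

A conversion tool following case 1 would skip (or punch holes for) everything outside the data extents of the source file, at the cost of the target no longer matching the decrypted source byte for byte.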
I talked with Max Reitz on IRC and we reached the conclusion that it is just not worth it to support this use case. LUKS just doesn't support sparseness, as I explained. If the user needs to convert a legacy LUKS image to a qcow2-based encrypted image and restore the sparseness, it is always possible to run fstrim from the guest to zero out all the unused areas and reclaim space. Unless there are objections to that, I'll close this bug with WONTFIX.
I agree with the WONTFIX nature of this; the moment you have encryption, you have to worry about whether sparse regions reveal information about your usage patterns. If you are okay with an attacker knowing how much of the disk you actually use, then punching holes is okay; but it's better to err on the side of safety and fully allocate an encrypted image so that it has byte-for-byte identical content (even when the content reads as zeros to the guest) than it is to use sparse regions (which read as zero on the host but as garbage to the guest, and reveal that the guest wasn't using that part of the disk). If upstream wants to add knobs to allow sparseness in spite of the security risks, that's one thing - but I don't see a business reason that we have to be the ones adding that feature for downstream usage.
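The "reads as zero on the host but as garbage to the guest" point can be illustrated with a toy cipher. The sketch below is NOT aes-xts and has no security value; it is a made-up 4-round Feistel construction over SHAKE-256 from the Python standard library, used only because any invertible keyed cipher shows the same behaviour. All names and parameters (SECTOR, ROUNDS, encrypt_sector, decrypt_sector) are assumptions for illustration.

```python
import hashlib

SECTOR = 512   # toy sector size in bytes
HALF = SECTOR // 2
ROUNDS = 4


def _round_fn(key, sector_no, rnd, data):
    """Keyed pseudo-random function used as the Feistel round function."""
    seed = key + sector_no.to_bytes(8, "little") + bytes([rnd]) + data
    return hashlib.shake_256(seed).digest(HALF)


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_sector(key, sector_no, plain):
    """Toy Feistel encryption of one sector, tweaked by sector number."""
    left, right = plain[:HALF], plain[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _round_fn(key, sector_no, rnd, right))
    return left + right


def decrypt_sector(key, sector_no, cipher):
    """Exact inverse of encrypt_sector."""
    left, right = cipher[:HALF], cipher[HALF:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _round_fn(key, sector_no, rnd, left)), left
    return left + right


if __name__ == "__main__":
    key = b"k" * 32
    hole = bytes(SECTOR)                     # what a host-side hole reads as
    guest_view = decrypt_sector(key, 0, hole)
    print("guest sees zeros:", guest_view == hole)
    print("re-encrypts to zeros:", encrypt_sector(key, 0, guest_view) == hole)
```

With this toy, a zero-filled host sector decrypts to non-zero data, and re-encrypting that data with the same key and sector number gives the zeros back, which is the situation case 2 of the earlier analysis relies on. With a different key the zeros do not come back, so the target would have to be written out in full.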
Based on research, closing this as WONTFIX
*** Bug 1702202 has been marked as a duplicate of this bug. ***