Bug 1371022 - Need to use blockdev-mirror instead of drive-mirror in order to support LUKS disks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.1
Assignee: Peter Krempa
QA Contact: Han Han
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-29 07:40 UTC by Yang Yang
Modified: 2020-05-05 09:45 UTC (History)

Fixed In Version: libvirt-6.0.0-4.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-05 09:43:16 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments
libvirtd.log on target host (400.62 KB, text/plain), 2016-08-29 07:41 UTC, Yang Yang
disk xml destinations (2.38 KB, application/gzip), 2020-03-19 07:32 UTC, Han Han


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:2017 0 None None None 2020-05-05 09:45:46 UTC

Description Yang Yang 2016-08-29 07:40:09 UTC
Description of problem:
I have a running VM that uses a LUKS-encrypted disk on the source host. I created the same LUKS secret on both the source and target hosts, then attempted to migrate the VM to the target host with the --copy-storage-all flag, but the migration failed. Libvirt creates the disk on the target host without encryption, so qemu fails to start the VM there.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-6.el7.x86_64
qemu-kvm-rhev-2.6.0-22.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. define a luks secret on both source and target
# cat secret-luks.xml
<secret ephemeral='no' private='yes'>
  <description>LUKS Sample Secret</description>
  <usage type='volume'>
    <volume>luks.secret</volume>
  </usage>
</secret>

# virsh secret-define secret-luks.xml

# MYSECRET=`printf %s "letmein" | base64`

# virsh secret-set-value a133d117-241c-4f7d-8ca4-56e1ea316f71 $MYSECRET

# virsh secret-list
 UUID                                  Usage
--------------------------------------------------------------------------------
 a133d117-241c-4f7d-8ca4-56e1ea316f71  volume luks.secret

2. create luks volume
# cat vol-luks-2.xml 
<volume>
  <name>luks.img</name>
  <source>
  </source>
  <capacity unit="G">1</capacity>
  <target>
    <format type='raw'/>
    <encryption format='luks'>
      <secret type='passphrase' usage='luks.secret'/>
    </encryption>
  </target>
</volume>

# virsh vol-create default vol-luks-2.xml 
Vol luks.img created from vol-luks-2.xml

# qemu-img info /var/lib/libvirt/images/luks.img 
image: /var/lib/libvirt/images/luks.img
file format: luks
virtual size: 1.0G (1073741824 bytes)
disk size: 256K
encrypted: yes

3. define/start vm
<disk type='file' device='disk'>
      <driver name='qemu' type='raw' io='threads' ioeventfd='on' event_idx='off' detect_zeroes='on'/>
      <source file='/var/lib/libvirt/images/luks.img'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <encryption format='luks'>
        <secret type='passphrase' usage='luks.secret'/>
      </encryption>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </disk>

4. migrate vm
# virsh migrate vm-mig qemu+ssh://10.66.4.152/system --verbose --copy-storage-all
root@10.66.4.152's password: 
error: internal error: process exited while connecting to monitor: 2016-08-29T07:35:14.496860Z qemu-kvm: -drive file=/var/lib/libvirt/images/luks.img,key-secret=virtio-disk1-luks-secret0,format=luks,if=none,id=drive-virtio-disk1,detect-zeroes=on,aio=threads: Volume is not in LUKS format

5. check luks vol in target host
# qemu-img info /var/lib/libvirt/images/luks.img 
image: /var/lib/libvirt/images/luks.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G


Actual results:
Migrating the vm with the copy-storage-all flag fails when the disk is luks encrypted

Expected results:
Migration works

Additional info:
Attached libvirtd.log from the target host. It shows that libvirt creates the luks vol w/o encryption:

2016-08-29 07:35:14.194+0000: 3948: debug : virStorageVolLookupByName:1293 : pool=0x7efe98001450, name=luks.img
2016-08-29 07:35:14.194+0000: 3948: debug : storageVolLookupByName:1528 : Storage volume not found: no storage vol with matching name 'luks.img'
2016-08-29 07:35:14.194+0000: 3948: debug : virStorageVolCreateXML:1466 : pool=0x7efe98001450, xmlDesc=<volume>
  <name>luks.img</name>
  <capacity>1073741824</capacity>
  <target>
    <format type='raw'/>
  </target>
</volume>
, flags=0
2016-08-29 07:35:14.208+0000: 3948: info : storageVolCreateXML:1996 : Creating volume 'luks.img' in storage pool 'default'
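
The difference between the two volume definitions can be checked mechanically. The following is a minimal sketch, not part of libvirt: it parses the volume XML from step 2 and the xmlDesc taken from the target log above, and reports the encryption format of each. The function name volume_encryption_format is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Volume XML used on the source host (step 2 of the reproducer).
SOURCE_VOLUME_XML = """<volume>
  <name>luks.img</name>
  <capacity unit="G">1</capacity>
  <target>
    <format type='raw'/>
    <encryption format='luks'>
      <secret type='passphrase' usage='luks.secret'/>
    </encryption>
  </target>
</volume>"""

# xmlDesc passed to virStorageVolCreateXML on the target (from libvirtd.log).
TARGET_VOLUME_XML = """<volume>
  <name>luks.img</name>
  <capacity>1073741824</capacity>
  <target>
    <format type='raw'/>
  </target>
</volume>"""


def volume_encryption_format(volume_xml):
    """Return the <encryption format=...> of a volume XML, or None if absent.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(volume_xml)
    encryption = root.find("./target/encryption")
    return None if encryption is None else encryption.get("format")


print(volume_encryption_format(SOURCE_VOLUME_XML))  # luks
print(volume_encryption_format(TARGET_VOLUME_XML))  # None
```

The second result matches the "file format: raw" output of qemu-img info on the target host in step 5.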

Comment 1 Yang Yang 2016-08-29 07:41:34 UTC
Created attachment 1195195 [details]
libvirtd.log on target host

Comment 3 John Ferlan 2016-09-01 13:12:21 UTC
Not sure this is going to be possible with the current code. Migration of disks isn't something I'm overly familiar with, but in general what I see happening is libvirt will create an NBD connection and use the 'drive-mirror' command to transfer disks (qemuMonitorJSONDriveMirror called eventually from qemuMigrationDriveMirror).

Looking at the QEMU code for drive-mirror, I'm not sure creation of a luks volume was added. I can see in the qemu-img code a call to bdrv_img_create with "options" in the 5th parameter.  The options are how qemu-img handles the "--object secret,id=xxx,..." and "-o key-secret=xxx" for the "-f luks" volume.  Compared to the qmp_drive_mirror code the two calls to bdrv_img_create pass NULL to the bdrv_img_create command.

I've placed a needsinfo on Dan Berrange who handled the qemu luks code to verify what I've seen or perhaps point me in the right direction to get this to work.

Comment 4 Daniel Berrangé 2016-09-05 10:36:14 UTC
John's correct - no support was added for creating LUKS volumes during drive-mirror.

Comment 5 Daniel Berrangé 2016-09-29 09:12:11 UTC
It turns out that QEMU 2.6 added a new "blockdev-mirror" command which is intended to replace "drive-mirror". This new command lets us provide image options, most particularly the key-secret required by LUKS, when mirroring. It is slightly different in that it can only operate on pre-existing images, so we would need to use qemu-img to create the new image upfront if needed.

Anyway, use of this new command should let libvirt fix the luks disk problem.
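
To illustrate the shape of such a sequence, here is a rough sketch that only builds the QMP JSON and does not talk to a monitor. It assumes a pre-existing local LUKS image opened with blockdev-add and then used as the blockdev-mirror target. The node name, job id and the helper function names are invented for this example; the secret id and device id are borrowed from the qemu command line in the description, and the device name would differ on a setup that uses -blockdev node names. It is not necessarily the sequence libvirt issues.

```python
import json


def luks_target_blockdev_add(node_name, filename, secret_id):
    """Build a blockdev-add command that opens an existing LUKS image.

    'key-secret' refers to a secret object that must already exist in QEMU.
    """
    return {
        "execute": "blockdev-add",
        "arguments": {
            "driver": "luks",
            "node-name": node_name,
            "key-secret": secret_id,
            "file": {"driver": "file", "filename": filename},
        },
    }


def blockdev_mirror(job_id, device, target_node, sync="full"):
    """Build a blockdev-mirror command copying 'device' to an opened node."""
    return {
        "execute": "blockdev-mirror",
        "arguments": {
            "job-id": job_id,
            "device": device,
            "target": target_node,
            "sync": sync,
        },
    }


# Hypothetical names: "mirror-target" and "copy-vdb" are assumptions.
commands = [
    luks_target_blockdev_add(
        "mirror-target",
        "/var/lib/libvirt/images/luks.img",
        "virtio-disk1-luks-secret0",
    ),
    blockdev_mirror("copy-vdb", "drive-virtio-disk1", "mirror-target"),
]

for command in commands:
    print(json.dumps(command))
```

Unlike drive-mirror, nothing here creates the target image, which is why the image has to be formatted as LUKS beforehand.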

Comment 6 Daniel Berrangé 2016-09-29 09:16:24 UTC
See also https://lists.gnu.org/archive/html/qemu-block/2016-09/msg00834.html

Comment 9 Jaroslav Suchanek 2019-04-24 12:26:31 UTC
This bug is going to be addressed in next major release.

Comment 10 Peter Krempa 2019-11-27 08:40:03 UTC
blockdev-mirror is used for the copy job if blockdev is enabled since:

commit ce7229a3b0d28479e0f123efce3fa73617889a50
Author: Peter Krempa <pkrempa>
Date:   Mon Jul 22 13:59:01 2019 +0200

    qemu: Add blockdev support for the block copy job
    
    Implement job handling for the block copy job (drive/blockdev-mirror)
    when using -blockdev. In contrast to the previously implemented
    blockjobs the block copy job introduces new images to the running qemu
    instance, thus requires a bit more handling.
    
    When copying to new images the code now makes use of blockdev-create to
    format the images explicitly rather than depending on automagic qemu
    behaviour.

The blockdev feature was enabled since:

commit c6a9e54ce3252196f1fc6aa9e57537a659646d18
Author: Peter Krempa <pkrempa>
Date:   Mon Jan 7 11:45:19 2019 +0100

    qemu: enable blockdev support

    Now that all pieces are in place (hopefully) let's enable -blockdev.

    We base the capability on presence of the fix for 'auto-read-only' on
    files so that blockdev works properly, mandate that qemu supports
    explicit SCSI id strings to avoid ABI regression and that the fix for
    'savevm' is present so that internal snapshots work.

v5.9.0-390-gc6a9e54ce3

and requires upstream qemu-4.2 or appropriate downstream.
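
For readers unfamiliar with blockdev-create, the following sketch shows roughly what a request to format a LUKS image looks like. It only builds the JSON. The job id, node name, secret id and function name are assumptions made for this illustration, and this is not a copy of what libvirt sends. The size is the 1 GiB capacity from the original report.

```python
import json


def luks_blockdev_create(job_id, file_node, size_bytes, secret_id):
    """Build a blockdev-create command that formats a node as LUKS.

    'file_node' is the node-name of an already opened protocol layer
    (for example a 'file' node); 'secret_id' names an existing secret object.
    """
    return {
        "execute": "blockdev-create",
        "arguments": {
            "job-id": job_id,
            "options": {
                "driver": "luks",
                "file": file_node,
                "size": size_bytes,
                "key-secret": secret_id,
            },
        },
    }


# Hypothetical ids; 1073741824 bytes is the 1G capacity from the reproducer.
command = luks_blockdev_create(
    "create-luks-dest", "dest-storage", 1073741824, "dest-luks-secret0"
)
print(json.dumps(command, indent=2))
```

blockdev-create runs as a job, so a caller would still have to wait for the job to conclude before using the image as a mirror target.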

Comment 12 Han Han 2020-01-20 04:31:49 UTC
Verified on libvirt-6.0.0-1.module+el8.2.0+5453+31b2b136.x86_64 qemu-kvm-4.2.0-6.module+el8.2.0+5453+31b2b136.x86_64:
1. Start a VM
2. Prepare a luks disk
# cat sec.xml                                                                                                                                                                                               
<secret ephemeral='no' private='yes'>
   <usage type='volume'>
      <volume>/var/lib/libvirt/images/luks-dest</volume>
   </usage>
</secret>

# qemu-img create -f luks --object secret,data=12345,id=sec0 -o key-secret=sec0 /var/lib/libvirt/images/luks-dest 10G

# virsh secret-define sec.xml

# virsh secret-set-value 73ad8135-0e18-4cb2-b98d-4e6daaf7c0ce --base64 $(printf 12345 | base64)

# cat luks.xml                                                                                                       
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/luks-dest'/>
      <target dev='vda' bus='virtio'/>
      <encryption format='luks'>
        <secret type='passphrase' usage='/var/lib/libvirt/images/luks-dest'/>
      </encryption>
    </disk>

3. Start blockcopy job:
# virsh blockcopy pc vda --reuse-external --xml luks.xml --transient-job --wait --verbose --pivot                 
Block Copy: [100 %]

4. Do copy storage migration:
Prepare luks disk and secret on destination host
# virsh migrate pc qemu+ssh://rhel82-nest2.usersys.redhat.com/system --verbose --copy-storage-all                                                                                                              
Migration: [100 %]

Comment 13 Peter Krempa 2020-01-20 12:01:44 UTC
Actually I don't think this works properly. For the image to actually use LUKS the <encryption> element must be a subelement of <source>. You've created a non-encrypted image.

Additionally while testing it I found a bug in the code where we actually wouldn't format the LUKS image due to a logic bug.

Moving back to ASSIGNED. The fix for creation of the LUKS disk is trivial but I'll also need to improve documentation.
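
The placement problem described above can be spotted before running blockcopy. This is a small sketch, with a made-up helper name, that reports where the <encryption> element sits in a disk XML. It is run against the luks.xml from comment 12 and against a variant with <encryption> moved under <source>.

```python
import xml.etree.ElementTree as ET

# luks.xml as used in comment 12: <encryption> is a direct child of <disk>.
DISK_XML_ENCRYPTION_UNDER_DISK = """<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/luks-dest'/>
  <target dev='vda' bus='virtio'/>
  <encryption format='luks'>
    <secret type='passphrase' usage='/var/lib/libvirt/images/luks-dest'/>
  </encryption>
</disk>"""

# Same disk with <encryption> moved inside <source>.
DISK_XML_ENCRYPTION_UNDER_SOURCE = """<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/luks-dest'>
    <encryption format='luks'>
      <secret type='passphrase' usage='/var/lib/libvirt/images/luks-dest'/>
    </encryption>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>"""


def encryption_placement(disk_xml):
    """Return 'source', 'disk', or None depending on where <encryption> is.

    Hypothetical helper for illustration only.
    """
    disk = ET.fromstring(disk_xml)
    if disk.find("./source/encryption") is not None:
        return "source"
    if disk.find("./encryption") is not None:
        return "disk"
    return None


print(encryption_placement(DISK_XML_ENCRYPTION_UNDER_DISK))    # disk
print(encryption_placement(DISK_XML_ENCRYPTION_UNDER_SOURCE))  # source
```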

Comment 14 Peter Krempa 2020-02-05 08:15:07 UTC
Formatting of raw images fixed upstream:

f4e7c792d5 qemu: block: Don't skip creation of 'luks' formatted images

Comment 16 Han Han 2020-03-19 07:16:52 UTC
Known issues:
for luks usage: https://bugzilla.redhat.com/show_bug.cgi?id=1814923
for nvme disk:  https://bugzilla.redhat.com/show_bug.cgi?id=1814947
for scsi pr:    https://bugzilla.redhat.com/show_bug.cgi?id=1814962

Version:
libvirt-6.0.0-13.virtcov.el8.x86_64
qemu-kvm-4.2.0-15.module+el8.2.0+6029+618ef2ec.x86_64

Comment 17 Han Han 2020-03-19 07:32:07 UTC
Created attachment 1671356 [details]
disk xml destinations

Test on libvirt-6.0.0-13.virtcov.el8.x86_64 qemu-kvm-4.2.0-15.module+el8.2.0+6029+618ef2ec.x86_64
1. Prepare block, gluster cluster, nbd server, rbd cluster, iscsi server
2. Prepare luks file on these backends above
3. Prepare secrets for luks, iscsi, rbd
4. Start VM do blockcopy with --reuse-external
for i in *dest.xml;do
        echo $i:                                                                                                                                                                       
        virsh blockcopy new sda --xml $i --pivot --transient-job --verbose --wait --reuse-external
        virsh detach-disk new sda
        sleep 5
        virsh attach-disk new /tmp/disk sda
done

Results:
block-dest.xml:
Block Copy: [100 %]
Successfully pivoted
Disk detached successfully

Disk attached successfully

file-dest.xml:
Block Copy: [100 %]
Successfully pivoted
Disk detached successfully

Disk attached successfully

gluster-dest.xml:
Block Copy: [100 %]
Successfully pivoted
Disk detached successfully

Disk attached successfully

iscsi-dest.xml:
Block Copy: [100 %]
Successfully pivoted
Disk detached successfully

Disk attached successfully

nbd-dest.xml:
Block Copy: [100 %]
Successfully pivoted
Disk detached successfully

Disk attached successfully

rbd-dest.xml:
Block Copy: [100 %]
Successfully pivoted
Disk detached successfully

Disk attached successfully


For migration with --copy-storage-all, I will test it after https://bugzilla.redhat.com/show_bug.cgi?id=1814923 is fixed

Comment 18 Han Han 2020-03-19 08:25:49 UTC
For blockcopy with slice&luks in dest xml, I will test it after https://bugzilla.redhat.com/show_bug.cgi?id=1814975 is fixed

Comment 20 Han Han 2020-04-09 07:50:31 UTC
Test for block-mirror of copy storage migration on libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64 qemu-kvm-4.2.0-17.module+el8.2.0+6141+0f540f16.x86_64

1. To make the testing scenarios more complex, set up the env for the tls migration tunnel:
Prepare tls CA&client certs on the src host, CA&server certs on the dest host
Enable tls verify on qemu.conf:
default_tls_x509_cert_dir = "/etc/pki/qemu"
default_tls_x509_verify = 1


2. Prepare file, block, iscsi, nbd, nvme, gluster, rbd environments
To make the tests more complex
   prepare iscsi with chap auth, rbd with cephx auth, nbd with tls creds
   prepare rbd and gluster with multi-nodes


3. Create a qcow2 luks file
# qemu-img create --object secret,data=redhat,id=sec0 -f qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 /tmp/new 100M

Convert it to these block backends above:
# qemu-img convert [--object,...] /tmp/new 'json:{"file": BLOCKDEV_SCHEMA }' -O raw -f raw


4. Prepare secrets and domain xml with the disk backends:
Block:
<disk type="block" device="disk">
  <driver name="qemu" type="qcow2"/>
  <source dev="/dev/vg/new">
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
  </source>
  <target dev="sda" bus="scsi"/>
</disk>

File:
<disk type="file" device="disk">
  <driver name="qemu" type="qcow2"/>
  <source file="/var/lib/libvirt/images/luks-dest">
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
  </source>
  <target dev="sda" bus="scsi"/>
</disk>

Gluster:
<disk type="network" device="disk">
  <driver name="qemu" type="qcow2"/>
  <source protocol='gluster' name='gv/luks-dest'>
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
    <host name='NODE1'/>
    <host name='NODE2'/>
  </source>
  <target dev="sda" bus="scsi"/>
</disk>

Iscsi:
<disk type="network" device="disk">
  <driver name="qemu" type="qcow2"/>
  <source protocol='iscsi' name='iqn.2020-03.com.libvirt:iscsi-chap/1'>
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
    <auth username='redhat'>
      <secret type='iscsi' usage='iscsi'/>
    </auth>
    <host name='HOST'/>
  </source>
  <target dev="sda" bus="scsi"/>
</disk>

Nbd:
<disk type="network" device="disk">
  <driver name="qemu" type="qcow2"/>
  <source protocol='nbd' name='nbd-tls' tls='yes'>
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
    <host name='HOST' port='10811'/>
  </source>
  <target dev="sda" bus="scsi"/>
</disk>

Nvme:
<disk type="nvme" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source type="pci" managed="yes" namespace="1">
        <address domain="0x0000" bus="0x44" slot="0x00" function="0x0"/>
	    <encryption format="luks">
	      <secret type="passphrase" usage="luks"/>
	    </encryption>
      </source>
      <target dev="sda" bus="scsi"/>
    </disk>

Rbd:
<disk type="network" device="disk">
  <driver name="qemu" type="raw"/>
  <source protocol='rbd' name='rbd/luks-dest'>
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
    <auth username='admin'>
      <secret type='ceph' usage='ceph_example'/>
    </auth>
    <host name='NODE1'/>
    <host name='NODE2'/>
    <host name='NODE3'/>
  </source>
  <target dev="sda" bus="scsi"/>
</disk>


5. Prepare migration env (hostname, add ports on firewall for migration)

6. Migrate with --tls --copy-storage-all --xml
# virsh migrate fedora31 qemu+ssh://DEST_HOST/system --copy-storage-all --tls --xml file-dest.xml-dom.xml --verbose


Test results:
File: pass
Block: pass
Iscsi: pass
Rbd: pass
Gluster: pass
But if gv/luks-dest is not pre-created, it reports a Permission Denied error and then triggers the segmentation fault of https://bugzilla.redhat.com/show_bug.cgi?id=1783187

Nvme:
# virsh migrate fedora31 qemu+ssh://hp-dl385g10-15.lab.eng.pek2.redhat.com/system --copy-storage-all --tls --xml nvme-dest.xml-dom.xml --verbose
error: internal error: cannot precreate storage for disk type 'nvme'

Nbd:
# virsh migrate fedora31 qemu+ssh://hp-dl385g10-15.lab.eng.pek2.redhat.com/system --copy-storage-all --xml nbd-dest.xml-dom.xml
error: operation failed: migration of disk sda failed: Invalid argument
And it could be reproduced without --tls

Comment 21 Han Han 2020-04-09 08:27:36 UTC
NBD passed when using the qcow2 luks file as the nbd backend

Comment 23 errata-xmlrpc 2020-05-05 09:43:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

