Bug 1405269 - libvirtd crashes when attaching raw LUKS volumes
Summary: libvirtd crashes when attaching raw LUKS volumes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: John Ferlan
QA Contact: yisun
URL:
Whiteboard:
Depends On:
Blocks: 1411394
Reported: 2016-12-16 01:51 UTC by Eric Wheeler
Modified: 2017-08-01 23:59 UTC

Fixed In Version: libvirt-3.0.0-1.el7
Doc Type: Bug Fix
Doc Text:
Cause: The libvirt code assumed that any domain disk device found to be LUKS encrypted would have an associated libvirt secret providing the key to unlock the device.
Consequence: libvirtd crashed when it attempted to access the nonexistent secret.
Fix: Check not only that the disk is encrypted but also that a secret exists before accessing the secret object to pass the secret along with the disk.
Result: With the patch, a LUKS encrypted disk that has no associated libvirt secret can be attached. The application is then responsible for unlocking the device.
Clone Of:
: 1411394 (view as bug list)
Environment:
Last Closed: 2017-08-01 17:19:14 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2017:1846
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2017-08-01 18:02:50 UTC

Description Eric Wheeler 2016-12-16 01:51:28 UTC
Description of problem:

Attaching raw LUKS volumes crashes libvirtd.  We unlock the device from within the guest, not from the outside.

It works with libvirt-daemon-driver-qemu-1.2.17-13.el7_2.3.x86_64, but we got the 7.3 update by surprise and it's broken with libvirt-daemon-driver-qemu-2.0.0-10.el7_3.2.x86_64.


Version-Release number of selected component (if applicable):

libvirt-daemon-driver-qemu-2.0.0-10.el7_3.2.x86_64

How reproducible:

Very

Steps to Reproduce:
1. cryptsetup luksFormat /dev/vg/something
2. save XML as disk.xml (we are using virtio-scsi, probably doesn't matter):

================= cut disk.xml ===============
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
  <source dev='/dev/vg/something'/>
  <target dev='sdb' bus='scsi'/>
  <serial>drive-scsi0-0-0-1</serial>
  <alias name='drive-scsi0-0-0-1'/>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
================= /cut ===============



3. # virsh attach-device my-favorite-vm disk.xml
error: Disconnected from qemu:///system due to I/O error
error: Failed to attach device from /tmp/tmp.GFfYPFAl1C
error: End of file while reading data: Input/output error

Actual results: See #3, above.



Expected results:

The disk should attach, but it doesn't.  Instead, libvirtd segfaults.

Additional info:

=== /var/log/messages
Dec 15 17:42:21 hv2.ewheeler.net kernel: libvirtd[13035]: segfault at 0 ip 00007fc737ff885d sp 00007fc743d5c830 error 4 in libvirt_driver_qemu.so[7fc737f90000+144000]
Dec 15 17:42:21 hv2.ewheeler.net systemd: libvirtd.service: main process exited, code=killed, status=11/SEGV
Dec 15 17:42:21 hv2.ewheeler.net systemd: Unit libvirtd.service entered failed state.
Dec 15 17:42:21 hv2.ewheeler.net systemd: libvirtd.service failed.
Dec 15 17:42:22 hv2.ewheeler.net systemd: libvirtd.service holdoff time over, scheduling restart.
Dec 15 17:42:22 hv2.ewheeler.net systemd: Starting Virtualization daemon...
Dec 15 17:42:22 hv2.ewheeler.net systemd: Started Virtualization daemon.

=== When attaching with gdb -p:
Detaching after fork from child process 18395.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe8303700 (LWP 18237)]
0x00007fffdad9b85d in qemuDomainSecretDiskPrepare ()
   from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
(gdb) 
(gdb) bt
#0  0x00007fffdad9b85d in qemuDomainSecretDiskPrepare ()
   from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#1  0x00007fffdadac0a9 in qemuDomainAttachDeviceDiskLive ()
   from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#2  0x00007fffdae1fbe7 in qemuDomainAttachDeviceFlags ()
   from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#3  0x00007ffff73dc616 in virDomainAttachDevice () from /lib64/libvirt.so.0
#4  0x000055555559710f in remoteDispatchDomainAttachDeviceHelper ()
#5  0x00007ffff7443012 in virNetServerProgramDispatch () from /lib64/libvirt.so.0
#6  0x00005555555a7c6d in virNetServerHandleJob ()
#7  0x00007ffff732fd41 in virThreadPoolWorker () from /lib64/libvirt.so.0
#8  0x00007ffff732f0c8 in virThreadHelper () from /lib64/libvirt.so.0
#9  0x00007ffff4952dc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff468173d in clone () from /lib64/libc.so.6

Comment 1 yisun 2016-12-16 10:43:41 UTC
Reproduced in my environment:
3.10.0-514.el7.x86_64
qemu-kvm-rhev-2.6.0-29.el7.x86_64
qemu-img-rhev-2.6.0-29.el7.x86_64

# lvdisplay
...
  --- Logical volume ---
  LV Path                /dev/rhel_bootp-73-75-161/lvol0
  LV Name                lvol0
  VG Name                rhel_bootp-73-75-161
  LV UUID                0ENvVT-vnxU-U6lZ-Wt7c-Ldrs-pc03-Bwhq2o
  LV Write Access        read/write
  LV Creation host, time bootp-73-75-161.lab.eng.pek2.redhat.com, 2016-12-16 18:27:36 +0800
  LV Status              available
  # open                 1
  LV Size                20.00 MiB
  Current LE             5
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:6

# cryptsetup luksFormat /dev/rhel_bootp-73-75-161/lvol0

WARNING!
========
This will overwrite data on /dev/rhel_bootp-73-75-161/lvol0 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase: 
Verify passphrase: 

# cat disk.xml
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
  <source dev='/dev/rhel_bootp-73-75-161/lvol0'/>
  <target dev='sdb' bus='virtio'/>
  <serial>drive-scsi0-0-0-1</serial>
  <alias name='drive-scsi0-0-0-1'/>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>


# virsh attach-device vm1 disk.xml 
error: Disconnected from qemu:///system due to I/O error
error: Failed to attach device from disk.xml
error: End of file while reading data: Input/output error

Comment 2 Eric Wheeler 2016-12-17 20:24:32 UTC
This is probably because the following is now implicit for disk attachments:
<disk> ...
	<encryption format="default"/>
</disk>

Of course it should fail more gracefully than a segfault, but a proper backward compatible update might introduce the format of "none" and let that be the default, instead of letting "default" be the default which tries to autodetect the disk format.  Perhaps "default" should be renamed "detect".  Fallback to passthrough if no key is loaded would be acceptable too.

It might be best if libvirt doesn't (by default) attempt to detect anything about volumes that are attached.  They should be opaque to the hypervisor unless a non-default setting like format="luks" or perhaps, hypothetically, format="detect" is specified.

Thank you for your help!

Comment 3 Eric Wheeler 2016-12-17 20:32:31 UTC
It looks like format=default is only for creation according to this
  https://libvirt.org/formatstorageencryption.html
so perhaps the <encryption> tag isn't implicit. Even so, libvirt should still gracefully pass the disk through if no secret is loaded into libvirt.

Comment 5 John Ferlan 2016-12-21 23:39:49 UTC
There is a workaround - of sorts - create the secret and add it to your XML. It worked for me...

Assume 'new_vol' has had cryptsetup luksFormat run on it.

# cat secret.xml
<secret ephemeral='no' private='yes'>
   <description>secret libvirt for /dev/LVM_Test/new_vol</description>
   <usage type='volume'>
      <volume>/dev/LVM_Test/new_vol</volume>
   </usage>
</secret>
# virsh secret-define secret.xml
Secret aa3f2251-4506-4d72-9a59-b17ea9d0ffef created

# MYSECRET=`printf %s "libvirt" | base64`

# virsh secret-set-value aa3f2251-4506-4d72-9a59-b17ea9d0ffef $MYSECRET
Secret value set

# cat disk-secret.xml
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
  <source dev='/dev/LVM_Test/new_vol'/>
  <target dev='sda' bus='scsi'/>
  <encryption format='luks'>
    <secret type='passphrase' uuid='aa3f2251-4506-4d72-9a59-b17ea9d0ffef'/>
  </encryption>
</disk>

# virsh attach-device f23 disk-secret.xml
Device attached successfully

#

So yes, the code shouldn't crash and that can be fixed; however, removing automatic recognition of local storage that is encrypted using LUKS would remove functionality, so the final solution could be a bit tricky.

I'll have to keep digging a bit more, but I wanted to at least provide some feedback.

FWIW:

The crash in question occurs because there is no secret defined:

#0  qemuDomainSecretDiskPrepare (conn=conn@entry=0x7fffb4000ad0, 
    priv=priv@entry=0x7fffc016c920, disk=disk@entry=0x7fffc04aff60)
    at qemu/qemu_domain.c:1218

code path:

    if (!virStorageSourceIsEmpty(src) && src->encryption &&
        src->encryption->format == VIR_STORAGE_ENCRYPTION_FORMAT_LUKS) {

        if (VIR_ALLOC(secinfo) < 0)
            return -1;

1218:  if (qemuDomainSecretSetup(conn, priv, secinfo, disk->info.alias,
                                  VIR_SECRET_USAGE_TYPE_VOLUME, NULL,
                                  &src->encryption->secrets[0]->seclookupdef,
                                  true) < 0)

where

(gdb) p *disk
$2 = {src = 0x7fffc04665c0, privateData = 0x7fffc00effd0, device = 0, bus = 2, 
  dst = 0x7fffc01a6240 "sda", tray_status = 0, removable = 0, mirror = 0x0, 
  mirrorState = 0, mirrorJob = 0, geometry = {cylinders = 0, heads = 0, 
    sectors = 0, trans = 0}, blockio = {logical_block_size = 0, 
    physical_block_size = 0}, blkdeviotune = {total_bytes_sec = 0, 
    read_bytes_sec = 0, write_bytes_sec = 0, total_iops_sec = 0, 
    read_iops_sec = 0, write_iops_sec = 0, total_bytes_sec_max = 0, 
    read_bytes_sec_max = 0, write_bytes_sec_max = 0, total_iops_sec_max = 0, 
    read_iops_sec_max = 0, write_iops_sec_max = 0, size_iops_sec = 0, 
    group_name = 0x0, total_bytes_sec_max_length = 0, 
    read_bytes_sec_max_length = 0, write_bytes_sec_max_length = 0, 
    total_iops_sec_max_length = 0, read_iops_sec_max_length = 0, 
    write_iops_sec_max_length = 0}, serial = 0x0, wwn = 0x0, vendor = 0x0, 
  product = 0x0, cachemode = 1, error_policy = 0, rerror_policy = 0, 
  iomode = 1, ioeventfd = 0, event_idx = 0, copy_on_read = 0, snapshot = 0, 
  startupPolicy = 0, transient = false, info = {
    alias = 0x7fffc02a36c0 "scsi0-0-0", type = 2, addr = {pci = {domain = 0, 
        bus = 0, slot = 0, function = 0, multi = 0}, drive = {controller = 0, 
        bus = 0, target = 0, unit = 0}, vioserial = {controller = 0, bus = 0, 
        port = 0}, ccid = {controller = 0, slot = 0}, usb = {bus = 0, port = {
          0, 0, 0, 0}}, spaprvio = {reg = 0, has_reg = false}, ccw = {
        cssid = 0, ssid = 0, devno = 0, assigned = false}, isa = {iobase = 0, 
        irq = 0}, dimm = {slot = 0, base = 0}}, mastertype = 0, master = {
      usb = {startport = 0}}, rombar = 0, romfile = 0x0, bootIndex = 0, 
    pciConnectFlags = 0}, rawio = 0, sgio = 0, discard = 1, iothread = 0, 
  detect_zeroes = 0, domain_name = 0x0}
(gdb) p src->encryption
$3 = (virStorageEncryptionPtr) 0x7fffc00ed570
(gdb) p *src->encryption
$4 = {format = 2, nsecrets = 0, secrets = 0x0, encinfo = {cipher_size = 0, 
    cipher_name = 0x0, cipher_mode = 0x0, cipher_hash = 0x0, ivgen_name = 0x0, 
    ivgen_hash = 0x0}}

Note the 'nsecrets = 0' and 'secrets = 0x0', whereas the call dereferences "src->encryption->secrets[0]->seclookupdef".
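The failure mode above can be illustrated with a minimal sketch. It is not the libvirt code or the upstream patch: the struct and function names below (all prefixed `sketch_`) are hypothetical, trimmed-down stand-ins for the fields shown in the gdb output, and the only value taken from this report is `format = 2` for LUKS. The idea is simply to check that a secret exists before touching `secrets[0]`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt structures printed by gdb above.
 * Only the fields relevant to the crash are modelled. */
enum { SKETCH_ENC_LUKS = 2 };        /* matches "format = 2" in the gdb dump */

struct sketch_secret {
    int lookup_type;                 /* placeholder for seclookupdef */
};

struct sketch_encryption {
    int format;
    size_t nsecrets;
    struct sketch_secret **secrets;
};

/* Returns true only when it is safe to dereference enc->secrets[0],
 * i.e. the disk is LUKS encrypted AND at least one secret was supplied. */
static bool sketch_has_usable_luks_secret(const struct sketch_encryption *enc)
{
    return enc != NULL &&
           enc->format == SKETCH_ENC_LUKS &&
           enc->nsecrets > 0 &&
           enc->secrets != NULL &&
           enc->secrets[0] != NULL;
}

/* Returns the first secret, or NULL when none was supplied, instead of
 * blindly evaluating enc->secrets[0] as the crashing code path did. */
static const struct sketch_secret *
sketch_first_secret(const struct sketch_encryption *enc)
{
    if (!sketch_has_usable_luks_secret(enc))
        return NULL;
    return enc->secrets[0];
}
```

With the values from the gdb session (`format = 2`, `nsecrets = 0`, `secrets = 0x0`), `sketch_first_secret()` returns NULL rather than dereferencing a null pointer, so a caller can skip the secret setup entirely.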

Comment 6 John Ferlan 2016-12-22 14:22:06 UTC
This issue is limited to hotplug of a disk device for which libvirt has determined the "on disk" format has LUKS encryption. The determination of the format is part of the processing of live disk attachment that doesn't occur during domain startup/cold plug processing.

I have posted a patch upstream that will resolve the issue.

http://www.redhat.com/archives/libvir-list/2016-December/msg01114.html

Comment 7 Eric Wheeler 2016-12-22 17:27:31 UTC
Awesome, thank you!

For the work-around in #5, does an incorrect key still pass the disk through unlocked, or is the example above assuming the correct key is being used?

-Eric

Comment 8 John Ferlan 2016-12-22 17:49:12 UTC
You'd need a valid secret; otherwise, qemu won't add the disk and the drive_add from libvirt receives an error:

# MYSECRET=`printf %s "badsecret" | base64`
# virsh secret-set-value aa3f2251-4506-4d72-9a59-b17ea9d0ffef $MYSECRET
# virsh attach-device f23 disk-secret.xml
error: Failed to attach device from disk-secret.xml
error: internal error: unable to execute QEMU command 'device_add': Property 'scsi-hd.drive' can't find value 'drive-scsi0-0-0-0'

#

Comment 9 Eric Wheeler 2016-12-30 04:12:25 UTC
Understood.  

Is your patch in c#6 slated for inclusion in the next libvirt package?  It blocks our adoption of el7.3 until then.

Thanks!

-Eric

Comment 10 John Ferlan 2016-12-30 13:11:07 UTC
Typically the process is - get it reviewed/ACK'd to be included in an upstream version... Then once there I can create a downstream patch for inclusion in a 7.3.z release. I don't have any idea "when" that 7.3.z release is generated, but I will look to get this included for that.

Comment 11 John Ferlan 2017-01-03 18:16:02 UTC
The change has been pushed upstream for 7.4.  Moving to POST, but a rhel 7.3.z cloned bz still needs to be created.

$ git show 

commit 7f7d99048350935a394d07b98a13d7da9c4b0502
Author: John Ferlan <jferlan>
Date:   Thu Dec 22 07:12:49 2016 -0500

    qemu: Don't assume secret provided for LUKS encryption
    
...
    
    If a secret was not provided for what was determined to be a LUKS
    encrypted disk (during virStorageFileGetMetadata processing when
    called from qemuDomainDetermineDiskChain as a result of hotplug
    attach qemuDomainAttachDeviceDiskLive), then do not attempt to
    look it up (avoiding a libvirtd crash) and do not alter the format
    to "luks" when adding the disk; otherwise, the device_add would
    fail with a message such as:
    
       "unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
        can't find value 'drive-scsi0-0-0-0'"
    
    because of assumptions that when the format=luks that libvirt would have
    provided the secret to decrypt the volume.
    
    Access to unlock the volume will thus be left to the application.


$ git describe 7f7d99048350935a394d07b98a13d7da9c4b0502
v2.5.0-284-g7f7d990
$
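
The second half of the commit message (not switching the format to "luks" when no secret was provided) can be sketched as follows. This is an illustration only, assuming a simplified model of the decision: `sketch_disk_src` and `sketch_qemu_drive_format` are hypothetical names, not libvirt functions, and the real change lives in libvirt's QEMU driver.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a disk source for illustration only. */
struct sketch_disk_src {
    const char *driver_format;  /* e.g. "raw", as in <driver type='raw'> */
    int detected_luks;          /* nonzero when LUKS metadata was detected */
    size_t nsecrets;            /* number of secrets supplied in the XML */
};

/* Picks the format string handed to QEMU for the drive.
 * "luks" is used only when libvirt can also provide the secret;
 * otherwise the disk is passed through in its configured format and
 * unlocking is left to the application. */
static const char *sketch_qemu_drive_format(const struct sketch_disk_src *src)
{
    if (src->detected_luks && src->nsecrets > 0)
        return "luks";
    return src->driver_format;
}
```

For the disk in this report (raw driver type, LUKS detected, no secret), the sketch yields "raw", which corresponds to the pass-through behaviour the reporter asked for.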

Comment 14 yisun 2017-01-17 10:54:01 UTC
Tested with the patch and it passed. Waiting for a new build; recording the test steps here.

# pvcreate /dev/sdb
# vgcreate vg1 /dev/sdb
# lvcreate vg1  /dev/sdb --size 10M
# lvdisplay 
  --- Logical volume ---
  LV Path                /dev/vg1/lvol0
 ...
# cryptsetup luksFormat /dev/vg1/lvol0
# cat /tmp/disk.xml
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
      <source dev='/dev/vg1/lvol0'/>
      <backingStore/>
      <target dev='sdb' bus='virtio'/>
      <serial>drive-scsi0-0-0-1</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
# virsh attach-device avocado-vt-vm1 /tmp/disk.xml 
Device attached successfully

# virsh dumpxml avocado-vt-vm1
...
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
      <source dev='/dev/vg1/lvol0'/>
      <backingStore/>
      <target dev='sdb' bus='virtio'/>
      <serial>drive-scsi0-0-0-1</serial>
      **<encryption format='luks'>**
      **</encryption>**
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
...

# virsh console avocado-vt-vm1
Connected to domain avocado-vt-vm1
Escape character is ^]
    # lsblk
    ...
    vdb           252:16   0   12M  0 disk 
    
    # mount /dev/vdb /mnt
    mount: unknown filesystem type 'crypto_LUKS'

# virsh detach-device avocado-vt-vm1 /tmp/disk.xml
Device detached successfully

# virsh domblklist avocado-vt-vm1
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/jeos-21-64.qcow2

Comment 18 yisun 2017-02-24 04:17:56 UTC
Verified with:
libvirt-3.0.0-2.el7.x86_64
qemu-kvm-rhev-2.8.0-4.el7.x86_64



## pvcreate /dev/sdd

## vgcreate vg_luks /dev/sdd
  Volume group "vg_luks" successfully created


## lvcreate vg_luks  /dev/sdd --size 10M
  Rounding up size to full physical extent 12.00 MiB
  Logical volume "lvol0" created.

## cryptsetup luksFormat /dev/vg_luks/lvol0

WARNING!
========
This will overwrite data on /dev/vg_luks/lvol0 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase: 
Verify passphrase: 


## cat /tmp/luks.disk 
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>
      <source dev='/dev/vg_luks/lvol0'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <serial>drive-scsi0-0-0-1</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>


## virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started


## virsh attach-device avocado-vt-vm1 /tmp/luks.disk 
Device attached successfully

## virsh console avocado-vt-vm1
Connected to domain avocado-vt-vm1

[root@yisun_vm1 ~]# lsblk
...
vdb           252:16   0  12M  0 disk 

[root@yisun_vm1 ~]# mount /dev/vdb /mnt
mount: unknown filesystem type 'crypto_LUKS'

Comment 19 errata-xmlrpc 2017-08-01 17:19:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
