Bug 1809453 - [RFE] Add support for LUKS encrypted disks with Clevis & Tang
Summary: [RFE] Add support for LUKS encrypted disks with Clevis & Tang
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libguestfs
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Laszlo Ersek
QA Contact: YongkuiGuo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-03 07:31 UTC by Fabien Dupont
Modified: 2022-11-15 10:13 UTC (History)

Fixed In Version: libguestfs-1.48.4-2.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-15 09:52:35 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2022:7958  0        None      None    None     2022-11-15 09:53:18 UTC

Description Fabien Dupont 2020-03-03 07:31:45 UTC
Description of problem:
Among customers using Linux with LUKS encryption, some use Network Bound Disk Encryption with Clevis & Tang [1]. It would be great to have support for Clevis & Tang in libguestfs.

A possible implementation would be to have a new set of options for Network Bound Disk Encryption (NBDE) mechanisms, i.e. Clevis & Tang for Linux and BitLocker Network Unlock for Windows.

[1] https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Security_Guide/sec-Using_Network-Bound_Disk_Encryption.html

Comment 5 RHEL Program Management 2021-09-03 07:27:07 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 6 Richard W.M. Jones 2021-09-03 09:17:22 UTC
This bug was closed in error by a process we do not control.  Reopening.

Comment 9 Laszlo Ersek 2022-02-04 12:31:05 UTC
- Bug 1658126 is about the guest internally using LUKS-on-LV, that is, formatting an LVM Logical Volume with cryptsetup (LUKS), and then placing a filesystem (incl. swap) on that. So we get something like:

# lsblk
NAME                                        MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                                           8:0    0    9G  0 disk  
├─sda1                                        8:1    0    1G  0 part  /boot
└─sda2                                        8:2    0    8G  0 part  
  ├─rhel-root                               253:0    0  7.1G  0 lvm   
  │ └─luks-07934cf9-3cef-4677-8ce3-0eb047d641f0
                                            253:3    0  7.1G  0 crypt /
  └─rhel-swap                               253:1    0  924M  0 lvm   
    └─luks-262fb950-61a0-47f7-a1f6-bb23d8a5fddf
                                            253:2    0  922M  0 crypt [SWAP]

- Bug 1451665 (solved) is about the guest internally using PV-on-LUKS, that is, formatting a traditional partition with cryptsetup (LUKS), then using the encrypted block device as an LVM Physical Volume, then creating LVs in the resultant volume group, and then placing swap and filesystems on those LVs. So we get something like:

# lsblk
NAME                          MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
xvda                          202:0    0 14.7G  0 disk  
├─xvda1                       202:1    0  500M  0 part  /boot
└─xvda2                       202:2    0 14.2G  0 part  
  └─luks-6b952cdc-9d5b-426d-bd0c-1ba16e00b4eb (dm-0)
                              253:0    0 14.2G  0 crypt 
    ├─VolGroup-lv_root (dm-1) 253:1    0 12.7G  0 lvm   /
    └─VolGroup-lv_swap (dm-2) 253:2    0  1.5G  0 lvm   [SWAP]
sr0                            11:0    1 1024M  0 rom

(I vaguely recall Anaconda switching between the two layouts at some point in the Fedora release history!)

- Bug 1398191 is about the guest NOT using cryptsetup internally at all; instead, the storage volume on the HOST side -- a regular file or a Logical Volume -- is encrypted, where QEMU performs LUKS encryption/decryption in userspace, and libvirtd manages the setup <https://libvirt.org/formatstorageencryption.html>. (In bug 1398191, only "scenario 1" matters.)

- Bug 1809453 is a feature request that applies to disk encryption *internal* to the guest. It applies to both LUKS-on-LV and PV-on-LUKS in Linux guests, and applies to BitLocker in Windows guests. The idea is that the secret (for decrypting the disks internally to the libguestfs appliance) is retrieved over the network <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/configuring-automated-unlocking-of-encrypted-volumes-using-policy-based-decryption_security-hardening>, and not specified by the end-user (cf. --key, --keys-from-stdin for virt-v2v).
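
To make the two guest-internal layouts above concrete, here is a minimal Python sketch that tells them apart by walking a block device tree. The node shape (name / type / children) is assumed to mirror what `lsblk --json` prints; the trees are hand-copied from the listings above with shortened names, and classify_layout() is a hypothetical helper, not part of libguestfs.

```python
# Hypothetical helper: distinguish LUKS-on-LV from PV-on-LUKS.
# The node shape (name/type/children) is an assumption modelled on
# the output of `lsblk --json`.

def classify_layout(node, parent_type=None, found=None):
    """Return the set of encryption layouts seen at or below `node`."""
    found = set() if found is None else found
    node_type = node.get("type")
    if node_type == "crypt" and parent_type == "lvm":
        found.add("LUKS-on-LV")   # LUKS container sits on a Logical Volume
    if node_type == "lvm" and parent_type == "crypt":
        found.add("PV-on-LUKS")   # LVs carved out of an encrypted PV
    for child in node.get("children", []):
        classify_layout(child, node_type, found)
    return found


# Layout from bug 1658126 (names shortened).
luks_on_lv = {"name": "sda2", "type": "part", "children": [
    {"name": "rhel-root", "type": "lvm", "children": [
        {"name": "luks-root", "type": "crypt"}]},
    {"name": "rhel-swap", "type": "lvm", "children": [
        {"name": "luks-swap", "type": "crypt"}]},
]}

# Layout from bug 1451665 (names shortened).
pv_on_luks = {"name": "xvda2", "type": "part", "children": [
    {"name": "luks-pv", "type": "crypt", "children": [
        {"name": "VolGroup-lv_root", "type": "lvm"},
        {"name": "VolGroup-lv_swap", "type": "lvm"}]},
]}

print(classify_layout(luks_on_lv))
print(classify_layout(pv_on_luks))
```

The host-side encryption of bug 1398191 would not show up in such a tree at all, since the guest sees only plaintext block devices in that case.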

Comment 10 Laszlo Ersek 2022-06-17 09:52:33 UTC
I've skimmed the documentation. More careful reading will be required, but at least (a) this looks interesting (even just setting up a basic configuration between VMs that employ these technologies), (b) the documentation does *not* indicate that a TPM is part of the picture (remember that the TPM requirement in Microsoft's BitLocker network unlock scheme was what killed bug 1808980).

Namely, <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/configuring-automated-unlocking-of-encrypted-volumes-using-policy-based-decryption_security-hardening> says that

- Clevis has separate pins for "tpm2" and "tang" (and the scheme does not seem to involve the former), plus the documentation says that

- "Tang is a server for binding data to network presence. It makes a system containing your data available when the system is bound to a certain secure network."

Which makes a big difference relative to Microsoft's scheme. Virt-v2v *can* be on the same secure network, and so it can (at least in theory) impersonate the original guest.

I'd like to investigate this more; again, minimally there's a personal learning experience hidden in this.

Comment 11 Richard W.M. Jones 2022-06-17 09:59:27 UTC
FWIW I set up some VMs for testing using Clevis/Tang and as
far as I remember it was both easy and didn't need TPMs, virtual or real.

Comment 12 Laszlo Ersek 2022-06-20 14:30:25 UTC
I've got a working "clevis VM <-> tang VM" setup (RHEL9 VMs).

The RHEL9 documentation <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/security_hardening/index#configuring-automated-unlocking-of-encrypted-volumes-using-policy-based-decryption_security-hardening> is good.

Regarding unlocking, the clevis-luks-unlockers(7) manual page provides a nice summary. It explains the "early" and "late" disk unlocking that's also described in the RHEL9 docs (linked above). In my test setup, I've enabled both; however, given that it's the system disk that's encrypted, I think it's the early (aka dracut) unlocker that's actually active.

However, likely more importantly for libguestfs, clevis-luks-unlockers(7) also references the clevis-luks-unlock(1) utility, which "Unlocks manually using the command line" and "unlocks a LUKS device using its already provisioned Clevis policy".

That is, if the libguestfs daemon can execute "clevis-luks-unlock" against a LUKS device, *and* the appliance has networking enabled, then a matching plaintext device should appear under "/dev/mapper/luks-UUID". The URL(s) of the Tang server(s) are embedded in the clevis token in one of the LUKS keyslots, using JSON syntax. Initially I thought that we'd have to parse that ourselves [*]; however, it seems like "clevis-luks-unlock" can do all that. We only need the appliance to have networking enabled, *and* the Tang server(s) to be reachable (under the same names / IP addresses as stored in the keyslot / clevis token) from the network where the appliance (or v2v conversion server) runs.

[*] clevis-luks-unlock(1) is a shell script (with some shell functions shared by multiple clevis utilities factored out to a separate shell script file: "/usr/bin/clevis-luks-common-functions"), effectively a very thick -- multi-layer -- wrapper around "cryptsetup open", "jose" (for the crypto operations) and "curl" (for contacting the Tang server).

Rudimentary speculation: we should add a new tri-state flag to <https://libguestfs.org/guestfs.3.html#guestfs_cryptsetup_open>. Value 0: don't attempt Clevis unlocking. Value 1: enable Clevis unlocking in addition to the usual passphrase-based unlocking (in "daemon/cryptsetup.ml", attempt executing "clevis-luks-unlock" *before* the usual stuff). Value 2: enable Clevis unlocking *only* (disable passphrase-based unlocking) -- in the last case, the "key" value would be ignored. The code is in "daemon/cryptsetup.ml".

The decrypt_mountables() function in "libguestfs-common/options/decrypt.c" (which is used whenever inspection is requested) will also need modification. Currently it bails out if no key for a particular UUID has been entered, but it should offer a mode for decrypting with clevis only.

For the "--key SELECTOR" option of the various utilities, we should introduce selector formats like:

--key ID:clevis:<ignored>
--key ID:clevis-then-key:KEY_STRING
--key ID:clevis-then-file:FILENAME

The first new format would map to Value 2; the last two new formats would map to Value 1.

Again, just some speculation in advance.
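
As an illustration of the speculated mapping only (none of this is an agreed design), here is a minimal Python sketch that parses the proposed selector formats into the tri-state value. ClevisMode, KeySelector and parse_selector() are hypothetical names; the "key" and "file" branches stand for the existing passphrase-based selector types.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class ClevisMode(IntEnum):
    """The speculated tri-state flag (member names are hypothetical)."""
    OFF = 0           # Value 0: don't attempt Clevis unlocking
    CLEVIS_FIRST = 1  # Value 1: try Clevis, then the usual passphrase
    CLEVIS_ONLY = 2   # Value 2: Clevis only; "key" would be ignored


@dataclass(frozen=True)
class KeySelector:
    device_id: str
    mode: ClevisMode
    key_string: Optional[str] = None
    key_file: Optional[str] = None


def parse_selector(selector: str) -> KeySelector:
    """Parse 'ID:TYPE[:VALUE]' according to the proposed formats."""
    device_id, sep, rest = selector.partition(":")
    if not sep or not device_id:
        raise ValueError(f"selector needs ID and type: {selector!r}")
    kind, _, value = rest.partition(":")
    if kind == "clevis":
        # Anything after "clevis:" is ignored, as proposed.
        return KeySelector(device_id, ClevisMode.CLEVIS_ONLY)
    if kind == "clevis-then-key":
        return KeySelector(device_id, ClevisMode.CLEVIS_FIRST, key_string=value)
    if kind == "clevis-then-file":
        return KeySelector(device_id, ClevisMode.CLEVIS_FIRST, key_file=value)
    if kind == "key":
        return KeySelector(device_id, ClevisMode.OFF, key_string=value)
    if kind == "file":
        return KeySelector(device_id, ClevisMode.OFF, key_file=value)
    raise ValueError(f"unknown selector type: {kind!r}")


print(parse_selector("/dev/sda2:clevis"))
print(parse_selector("/dev/sda2:clevis-then-file:/run/keyfile"))
```

Splitting on the first colon only keeps any further colons inside KEY_STRING or FILENAME intact; it also assumes that ID itself contains no colon.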

Comment 13 Laszlo Ersek 2022-06-21 12:41:38 UTC
Upon a closer look, I think we'll need a brand new API. Reasons:

- guestfs_cryptsetup_open() permits "crypttype" = bitlk, which makes no sense with clevis

- guestfs_cryptsetup_open() permits "readonly" = 1, but read-only mappings are not supported by clevis-luks-unlock(1)

- guestfs_cryptsetup_open() requires "key", but "key" is useless if clevis-luks-unlock(1) succeeds.

- guestfs_cryptsetup_open() depends on feature "luks", but for clevis we need a new feature dependency, and such dependencies are expressed at the API level.

Comment 14 Dr. David Alan Gilbert 2022-06-21 13:48:04 UTC
you might also like to look at systemd-cryptenroll as clevis.

Comment 15 Laszlo Ersek 2022-06-21 16:51:10 UTC
I've got a working libguestfs patch for wrapping clevis-luks-unlock(1); the following works:

guestfish --ro -d clevis --network \
  launch : \
  clevis-luks-unlock /dev/sda2 foobar : \
  lvm-scan true : \
  list-filesystems : \
  mount /dev/rhel_clevis/root / : \
  ll /

I've tested some failure modes as well (Tang server down, "--network" option absent, Tang server name unresolvable).
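
For scripting the same sequence, here is a small Python sketch that only assembles the argument vector shown above; it does not run anything. The function name and its parameters are hypothetical; the options and guestfish commands are the ones from the invocation above, and "clevis-luks-unlock" is only available with the patch applied.

```python
import shlex


def guestfish_clevis_argv(domain, luks_device, map_name, root_fs):
    """Build the guestfish command line from the example above."""
    steps = [
        ["launch"],
        ["clevis-luks-unlock", luks_device, map_name],
        ["lvm-scan", "true"],
        ["list-filesystems"],
        ["mount", root_fs, "/"],
        ["ll", "/"],
    ]
    # --network matters: without it the appliance cannot reach the Tang server.
    argv = ["guestfish", "--ro", "-d", domain, "--network"]
    for index, step in enumerate(steps):
        if index:
            argv.append(":")  # guestfish separates commands with ":"
        argv.extend(step)
    return argv


argv = guestfish_clevis_argv(
    "clevis", "/dev/sda2", "foobar", "/dev/rhel_clevis/root")
print(shlex.join(argv))
```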

Next, "libguestfs-common/options/decrypt.c" will have to put it to use (also a new syntax for --key will be needed; likely "--key ID:clevis:<ignored>").

(In reply to Dr. David Alan Gilbert from comment #14)
> you might also like to look at systemd-cryptenroll as clevis.

<https://www.freedesktop.org/software/systemd/man/systemd-cryptenroll.html> seems hardware token-oriented; I think that's pretty difficult to support with libguestfs (and out of scope for this BZ anyway). TPM2 will simply not work (comment 10); YubiKeys might theoretically work with USB passthrough (probably complex).

Comment 16 Laszlo Ersek 2022-06-28 11:59:19 UTC
[Libguestfs] LUKS decryption with Clevis+Tang | CVE-2022-2211
Message-Id: <e5f2b088-7aef-c3bc-b660-d11dd0f55f1d>
https://listman.redhat.com/archives/libguestfs/2022-June/029274.html

[libguestfs-common PATCH 00/12] LUKS decryption with Clevis+Tang | CVE-2022-2211
Message-Id: <20220628114915.5030-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029277.html

[libguestfs PATCH 0/3] LUKS decryption with Clevis+Tang | CVE-2022-2211
Message-Id: <20220628115418.5376-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029290.html

[guestfs-tools PATCH 0/4] LUKS decryption with Clevis+Tang
Message-Id: <20220628115702.5584-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029293.html

[v2v PATCH] convert: document networking dependency of "--key ID:clevis"
Message-Id: <20220628115856.5820-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029298.html

Comment 17 Laszlo Ersek 2022-06-29 13:35:48 UTC
(In reply to Laszlo Ersek from comment #16)
> [libguestfs-common PATCH 00/12] LUKS decryption with Clevis+Tang | CVE-2022-2211
> Message-Id: <20220628114915.5030-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029277.html

The CVE fix (the first patch in this series) has been pushed upstream: commit 35467027f657.

> [libguestfs PATCH 0/3] LUKS decryption with Clevis+Tang | CVE-2022-2211
> Message-Id: <20220628115418.5376-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029290.html

The documentation of the CVE (the first patch in this series) has been pushed upstream: commit 99844660b48e.

Comment 18 Laszlo Ersek 2022-06-30 12:21:24 UTC
[libguestfs-common PATCH v2 00/12] LUKS decryption with Clevis+Tang
Message-Id: <20220630122028.19283-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029352.html

[libguestfs PATCH v2 0/3] LUKS decryption with Clevis+Tang
Message-Id: <20220630122048.19335-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029365.html

Comment 19 Laszlo Ersek 2022-07-01 13:01:47 UTC
(In reply to Laszlo Ersek from comment #18)
> [libguestfs-common PATCH v2 00/12] LUKS decryption with Clevis+Tang
> Message-Id: <20220630122028.19283-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029352.html

Upstream commit range 35467027f657..af6cb55bc58a.

Comment 20 Laszlo Ersek 2022-07-01 13:15:33 UTC
(In reply to Laszlo Ersek from comment #18)

> [libguestfs PATCH v2 0/3] LUKS decryption with Clevis+Tang
> Message-Id: <20220630122048.19335-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029365.html

Upstream commit range 99844660b48e..6a5b44f53806.

Comment 21 Laszlo Ersek 2022-07-01 13:28:14 UTC
(In reply to Laszlo Ersek from comment #16)

> [guestfs-tools PATCH 0/4] LUKS decryption with Clevis+Tang
> Message-Id: <20220628115702.5584-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029293.html

Upstream commit range b2e7de29b413..1cce13223e93.

Comment 22 Laszlo Ersek 2022-07-01 13:37:05 UTC
(In reply to Laszlo Ersek from comment #16)

> [v2v PATCH] convert: document networking dependency of "--key ID:clevis"
> Message-Id: <20220628115856.5820-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029298.html

Upstream commit 98fa5ab26853.

Comment 25 YongkuiGuo 2022-07-04 09:22:01 UTC
Hi, lersek

After installing clevis-luks and the libguestfs (1.48.3-4.el9) related packages on the RHEL 9.1 host, clevis-luks is not included in the /usr/lib64/guestfs/supermin.d/packages file. Thus I have to add clevis-luks manually to enable the clevisluks feature. Maybe we need to change the package list in supermin.

Comment 26 Richard W.M. Jones 2022-07-04 10:04:05 UTC
This is actually a problem in the libguestfs packaging.  I should have added:

BuildRequires: clevis-luks

to the spec file, but I forgot to.  Sorry.  I will fix this shortly.

Comment 27 YongkuiGuo 2022-07-04 10:16:30 UTC
(In reply to Richard W.M. Jones from comment #26)
> This is actually a problem in the libguestfs packaging.  I should have added:
> 
> BuildRequires: clevis-luks
> 
> to the spec file, but I forgot to.  Sorry.  I will fix this shortly.

Thanks a lot.

Comment 28 Richard W.M. Jones 2022-07-05 11:29:43 UTC
This is also fixed in virt-v2v-2.0.6-3.el9

Comment 29 tingting zheng 2022-07-06 02:22:52 UTC
(In reply to Richard W.M. Jones from comment #28)
> This is also fixed in virt-v2v-2.0.6-3.el9

Vera, would you please help test this bug from the virt-v2v side? Thanks.

Comment 30 YongkuiGuo 2022-07-07 08:57:09 UTC
Verified with packages:
libguestfs-1.48.3-5.el9.x86_64
guestfs-tools-1.48.2-4.el9.x86_64


Steps:

1. Create a VM (rhel9-luks) with an encrypted disk
[root@localhost ~]# lsblk
NAME                                          MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sr0                                            11:0    1 1024M  0 rom   
vda                                           252:0    0   10G  0 disk  
├─vda1                                        252:1    0    1G  0 part  /boot
└─vda2                                        252:2    0    9G  0 part  
  └─luks-c589defb-6e58-4bcc-a09c-2bd455396fe9 253:0    0    9G  0 crypt 
    ├─rhel-swap                               253:1    0    1G  0 lvm   [SWAP]
    └─rhel-root                               253:2    0    8G  0 lvm   /

2. Create another VM (rhel9-tang) 
3. Install clevis client on rhel9-luks VM and tang server on rhel9-tang VM respectively according to doc: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/security_hardening/index#configuring-automated-unlocking-of-encrypted-volumes-using-policy-based-decryption_security-hardening

4.
# virsh list
 Id   Name         State
----------------------------
 10   rhel9-tang   running
 25   rhel9-luks   running

5.
# LIBGUESTFS_BACKEND=direct guestfish --ro -d rhel9-luks --network

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
      ‘man’ to read the manual
      ‘quit’ to quit the shell

><fs> run
><fs> clevis-luks-unlock /dev/sda2 clevis
><fs> lvm-scan true
><fs> list-filesystems 
/dev/sda1: xfs
/dev/rhel/root: xfs
/dev/rhel/swap: swap
><fs> mount /dev/rhel/root /
><fs> ll /
total 24
dr-xr-xr-x.  18 root root  235 Jun 30 08:10 .
drwxr-xr-x   20 root root 4096 Jul  7 08:25 ..
dr-xr-xr-x.   2 root root    6 Aug  9  2021 afs
lrwxrwxrwx.   1 root root    7 Aug  9  2021 bin -> usr/bin
drwxr-xr-x.   2 root root    6 Jun 30 08:10 boot
drwxr-xr-x.   2 root root    6 Jun 30 08:10 dev
drwxr-xr-x. 108 root root 8192 Jul  7 08:19 etc
drwxr-xr-x.   2 root root   71 Jun 30 10:33 home
lrwxrwxrwx.   1 root root    7 Aug  9  2021 lib -> usr/lib
lrwxrwxrwx.   1 root root    9 Aug  9  2021 lib64 -> usr/lib64
drwxr-xr-x.   2 root root    6 Aug  9  2021 media
drwxr-xr-x.   2 root root    6 Aug  9  2021 mnt
drwxr-xr-x.   3 root root   16 Jun 30 08:11 opt
drwxr-xr-x.   2 root root    6 Jun 30 08:10 proc
dr-xr-x---.   2 root root  167 Jun 30 10:33 root
drwxr-xr-x.   2 root root    6 Jun 30 08:10 run
lrwxrwxrwx.   1 root root    8 Aug  9  2021 sbin -> usr/sbin
drwxr-xr-x.   2 root root    6 Aug  9  2021 srv
drwxr-xr-x.   2 root root    6 Jun 30 08:10 sys
drwxrwxrwt.   9 root root 4096 Jul  7 08:19 tmp
drwxr-xr-x.  12 root root  144 Jun 30 08:10 usr
drwxr-xr-x.  20 root root 4096 Jun 30 08:18 var

><fs> quit

6. Shut down the rhel9-luks VM
# virsh  shutdown rhel9-luks

7. Check virt-cat command
# virt-cat -a /var/lib/libvirt/images/rhel9-luks.qcow2 --key /dev/sda2:clevis /etc/fstab 
...
/dev/mapper/rhel-root   /                       xfs     defaults,x-systemd.device-timeout=0 0 0
UUID=af534f34-cac4-4654-88c5-9c58556ce71c /boot                   xfs     defaults        0 0
/dev/mapper/rhel-swap   none                    swap    defaults,x-systemd.device-timeout=0 0 0

8. Test virt-customize command
# virt-customize -a /var/lib/libvirt/images/rhel9-luks.qcow2 --key /dev/sda2:clevis --touch /root/testfile 
[   0.0] Examining the guest ...
[  11.7] Setting a random seed
[  11.7] Running touch: /root/testfile
[  11.7] SELinux relabelling
[  27.7] Finishing off

9. Test virt-sysprep command
# virt-sysprep -a /home/images/rhel9-luks-clevis.qcow2 --key /dev/sda2:clevis
[   0.0] Examining the guest ...
[  11.6] Performing "abrt-data" ...
...
[  12.7] SELinux relabelling
[  28.3] Performing "lvm-uuids" ...

The results of the above commands are correct.

Comment 31 Laszlo Ersek 2022-07-07 11:14:10 UTC
Thanks for the test!

Comment 32 YongkuiGuo 2022-07-12 08:04:04 UTC
Tested this bug with LUKS-on-LV:

Steps:

1. Create a VM (rhel9-luks-on-lv) with luks encrypted LVs
[root@localhost ~]# lsblk
NAME                                            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sr0                                              11:0    1 1024M  0 rom   
vda                                             252:0    0   10G  0 disk  
├─vda1                                          252:1    0    1G  0 part  /boot
└─vda2                                          252:2    0    9G  0 part  
  ├─rhel-root                                   253:0    0    8G  0 lvm   
  │ └─luks-e5b1101d-815d-4f07-a19e-8f739bcc3657 253:2    0    8G  0 crypt /
  └─rhel-swap                                   253:1    0    1G  0 lvm   
    └─luks-91fdf893-fff9-40b1-be55-ddb872501e72 253:3    0 1008M  0 crypt [SWAP]

2. Use the same tang server in comment 30

3.
# LIBGUESTFS_BACKEND=direct guestfish --ro -d rhel9-luks-on-lv --network

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
      ‘man’ to read the manual
      ‘quit’ to quit the shell

><fs> run
><fs> lvs
/dev/rhel/root
/dev/rhel/swap
><fs> vfs-type /dev/rhel/root
crypto_LUKS
><fs> vfs-type /dev/rhel/swap
crypto_LUKS
><fs> clevis-luks-unlock /dev/rhel/root clevis
><fs> lvm-scan true
><fs> list-filesystems 
/dev/mapper/clevis: xfs
/dev/sda1: xfs
><fs> mount /dev/mapper/clevis /
><fs> ls /
afs
bin
boot
dev
etc
home
lib
lib64
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
><fs>

4. Shut down the rhel9-luks-on-lv VM
# virsh  shutdown rhel9-luks-on-lv

5.
# virt-cat -a /var/lib/libvirt/images/rhel9-luks-on-lv.qcow2 --key /dev/rhel/root:clevis --key /dev/rhel/swap:clevis /etc/fstab
...
/dev/mapper/luks-e5b1101d-815d-4f07-a19e-8f739bcc3657 /                       xfs     defaults,x-systemd.device-timeout=0 0 0
UUID=ea634d05-85a5-4c27-9364-bc894eb0e11d /boot                   xfs     defaults        0 0
/dev/mapper/luks-91fdf893-fff9-40b1-be55-ddb872501e72 none                    swap    defaults,x-systemd.device-timeout=0 0 0

6.
# virt-customize -a /var/lib/libvirt/images/rhel9-luks-on-lv.qcow2 --key /dev/rhel/root:clevis --key /dev/rhel/swap:clevis --touch /root/testfile
[   0.0] Examining the guest ...
[  16.9] Setting a random seed
[  17.0] Running touch: /root/testfile
[  17.0] SELinux relabelling
[  32.5] Finishing off

7.
# virt-sysprep  -a /var/lib/libvirt/images/rhel9-luks-on-lv.qcow2 --key /dev/rhel/root:clevis --key /dev/rhel/swap:clevis
[   0.0] Examining the guest ...
[  16.7] Performing "abrt-data" ...
[  16.7] Performing "backup-files" ...
[  17.2] Performing "bash-history" ...
[  17.2] Performing "blkid-tab" ...
[  17.2] Performing "crash-data" ...
[  17.2] Performing "cron-spool" ...
[  17.2] Performing "dhcp-client-state" ...
[  17.2] Performing "dhcp-server-state" ...
[  17.2] Performing "dovecot-data" ...
[  17.2] Performing "ipa-client" ...
[  17.3] Performing "kerberos-hostkeytab" ...
[  17.3] Performing "logfiles" ...
[  17.3] Performing "lvm-system-devices" ...
[  17.3] Performing "machine-id" ...
[  17.3] Performing "mail-spool" ...
[  17.3] Performing "net-hostname" ...
[  17.3] Performing "net-hwaddr" ...
[  17.4] Performing "net-nmconn" ...
[  17.4] Performing "pacct-log" ...
[  17.4] Performing "package-manager-cache" ...
[  17.4] Performing "pam-data" ...
[  17.4] Performing "passwd-backups" ...
[  17.4] Performing "puppet-data-log" ...
[  17.4] Performing "rh-subscription-manager" ...
[  17.5] Performing "rhn-systemid" ...
[  17.5] Performing "rpm-db" ...
[  17.5] Performing "samba-db-log" ...
[  17.5] Performing "script" ...
[  17.5] Performing "smolt-uuid" ...
[  17.5] Performing "ssh-hostkeys" ...
[  17.5] Performing "ssh-userdir" ...
[  17.5] Performing "sssd-db-log" ...
[  17.6] Performing "tmp-files" ...
[  17.6] Performing "udev-persistent-net" ...
[  17.6] Performing "utmp" ...
[  17.6] Performing "yum-uuid" ...
[  17.6] Performing "customize" ...
[  17.6] Setting a random seed
[  17.6] Setting the machine ID in /etc/machine-id
[  17.6] SELinux relabelling
[  33.0] Performing "lvm-uuids" ...
virt-sysprep: error: libguestfs error: vg_activate_all: vgchange:   Logical 
volume rhel/root is used by another device.
  Can't deactivate volume group "rhel" with 2 open logical volume(s)

If reporting bugs, run virt-sysprep with debugging enabled and include the 
complete output:

  virt-sysprep -v -x [...]



Hi lersek, the virt-sysprep command failed with the above error. I will attach the whole output soon.

Comment 34 Laszlo Ersek 2022-07-12 08:25:19 UTC
Hi YongkuiGuo,

thanks for the test.

This time I don't think I need the -v -x output; I think I can tell what's happening from the virt-sysprep source code, and the lsblk output you provided above.

The problem is not related to Clevis+Tang; it is related to LUKS-on-LV.

The "lvm-uuids" operation of virt-sysprep *seems* to assume that logical volumes are the leaf nodes in the block device tree. The source code goes like this:

      if has_pvs || has_vgs then g#vg_activate_all false;
      if has_pvs then g#pvchange_uuid_all ();
      if has_vgs then g#vgchange_uuid_all ();
      if has_pvs || has_vgs then g#vg_activate_all true

What surely fails here is the initial deactivation, and the reason for that failure must be that the decrypted block devices (decrypted with Clevis+Tang, or otherwise, doesn't matter) keep the LVs open. I think the original assumption here was that simply unmounting (?) the filesystems would make the LVs un-referenced, and therefore possible to deactivate. With LUKS between LVs and filesystems, that's no longer the case.

Now, this is at least partially speculation on my part, as I can't yet tell where said "umount" is supposed to happen -- that is, why lvm-uuids works on a LUKS-less disk image!

... A-ha! I do know that now. The virt-sysprep operations are categorized in (at least!) two classes, filesystem-based and device-based. See "perform_on_filesystems" and "perform_on_devices" in "sysprep/sysprep_operation.mli":

  perform_on_devices : device_side_effects callback option;
  (** This is the same as {!perform_on_filesystems} except that
      the guest filesystem(s) are {i not} mounted.  This allows the
      operation to work directly on block devices, LVs etc. *)

and in "sysprep/main.ml" we have:

        (* Perform the filesystem operations. *)
        Sysprep_operation.perform_operations_on_filesystems
          ?operations g root side_effects;

        [...]

        (* Unmount everything in this guest. *)
        g#umount_all ();

        [...]

        (* Perform the block device operations. *)
        Sysprep_operation.perform_operations_on_devices
          ?operations g root side_effects;

So that's where our "umount" is -- and it confirms my analysis.

In other words, this is a corner case that we all missed when fixing and verifying bug 1658126.

Please file a new bug about this -- but the reproducer there need not use Clevis+Tang, simple LUKS-on-LV should be sufficient.

Also, this problem should not block the verification of the present BZ; I think you can move this to VERIFIED.

Thanks!
Laszlo
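
The failure mode analysed above can be mimicked with a toy model, sketched here in Python. It assumes only what the analysis states: an LV cannot be deactivated while it is mounted or while another device (here, a decrypted LUKS mapping) is stacked on top of it. BlockDev and the helper functions are hypothetical and unrelated to the real virt-sysprep code.

```python
from dataclasses import dataclass, field


@dataclass
class BlockDev:
    name: str
    mounted: bool = False
    holders: list = field(default_factory=list)  # devices stacked on top


def umount_all(dev):
    """Unmount this device and everything stacked on it; mappings stay."""
    dev.mounted = False
    for holder in dev.holders:
        umount_all(holder)


def in_use(dev):
    """Busy while mounted or while any mapping still sits on top."""
    return dev.mounted or bool(dev.holders)


def can_deactivate_vg(lvs):
    """A volume group can be deactivated only if none of its LVs is busy."""
    return not any(in_use(lv) for lv in lvs)


# LUKS-less guest: filesystems live directly on the LVs.
plain = [BlockDev("rhel/root", mounted=True), BlockDev("rhel/swap")]

# LUKS-on-LV guest: a decrypted mapping holds each LV open.
encrypted = [
    BlockDev("rhel/root", holders=[BlockDev("luks-root", mounted=True)]),
    BlockDev("rhel/swap", holders=[BlockDev("luks-swap")]),
]

for lvs in (plain, encrypted):
    for lv in lvs:
        umount_all(lv)

print(can_deactivate_vg(plain))      # True: unmounting was enough
print(can_deactivate_vg(encrypted))  # False: the crypt mappings remain
```

In this model the second result only flips to True once the mappings themselves are removed from "holders"; whether virt-sysprep should ever do the equivalent is a separate question.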

Comment 35 Laszlo Ersek 2022-07-12 08:32:33 UTC
Actually, YongkuiGuo,

please wait a minute with filing a new bug!

I'm recalling a discussion about distributing encrypted guest disk images. I can't remember *where* I read it, but the gist of the discussion was this:

* never distribute encrypted guest disk images; never use them as "master" images that are to be cloned as actual virtual machine disk images.

The reason for this advice (according to the article) was that all those disk images (cloned from the same "master image") would use the same *internal* LUKS master key. Even if you change the particular keys in the LUKS keyslots, those would not change the internal master key. Changing the internal master key actually requires decrypting and re-encrypting the whole disk, sector by sector. And distributing multiple images with identical (not randomly generated, unique) internal master keys is a big NO-NO cryptographically speaking.

Now, consider the purpose of virt-sysprep: it aims directly at scrubbing a VM disk image so that it can be used later as a VM master image.

In other words, virt-sysprep should not be used *AT ALL* on encrypted disks. With that, I'm tempted to qualify your virt-sysprep test, on *any* disk image that uses LUKS (LUKS-on-LV or LV-on-LUKS, does not matter), as invalid.

Comment 36 Laszlo Ersek 2022-07-12 08:39:36 UTC
Found it -- it's in the LUKS FAQ:

https://gitlab.com/cryptsetup/cryptsetup/-/blob/main/FAQ.md

> CLONING/IMAGING: If you clone or image a LUKS container, you make a
> copy of the LUKS header and the volume key will stay the same!  That
> means that if you distribute an image to several machines, the same
> volume key will be used on all of them, regardless of whether you
> change the passphrases.  Do NOT do this!  If you do, a root-user on
> any of the machines with a mapped (decrypted) container or a
> passphrase on that machine can decrypt all other copies, breaking
> security.  See also Item 6.15.
>
> [...]
>
> 6.15 Can I clone a LUKS container?
>
> You can, but it breaks security, because the cloned container has the
> same header and hence the same volume key.  Even if you change the
> passphrase(s), the volume key stays the same.  That means whoever has
> access to one of the clones can decrypt them all, completely bypassing
> the passphrases.
>
> While you can use cryptsetup-reencrypt to change the volume key, this
> is probably more effort than to create separate LUKS containers in the
> first place.
>
> The right way to do this is to first luksFormat the target container,
> then to clone the contents of the source container, with both
> containers mapped, i.e.  decrypted.  You can clone the decrypted
> contents of a LUKS container in binary mode, although you may run into
> secondary issues with GUIDs in filesystems, partition tables,
> RAID-components and the like.  These are just the normal problems
> binary cloning causes.
>
> Note that if you need to ship (e.g.) cloned LUKS containers with a
> default passphrase, that is fine as long as each container was
> individually created (and hence has its own volume key).  In this
> case, changing the default passphrase will make it secure again.
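
The point of the FAQ entry can be shown with a deliberately simplified Python model. This is NOT the real LUKS on-disk format: it assumes a single keyslot that wraps a random volume key by XOR with a passphrase-derived key, and ToyContainer is a hypothetical class that exists only for this illustration.

```python
import copy
import hashlib
import os


def _kek(passphrase, salt):
    # Key-encryption key derived from the passphrase (toy parameters).
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 1000)


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


class ToyContainer:
    """Toy header with one keyslot wrapping a volume key. NOT real LUKS."""

    def __init__(self, passphrase):
        volume_key = os.urandom(32)  # generated once, at format time
        self.digest = hashlib.sha256(volume_key).digest()
        self._wrap(volume_key, passphrase)

    def _wrap(self, volume_key, passphrase):
        self.salt = os.urandom(16)
        self.slot = _xor(volume_key, _kek(passphrase, self.salt))

    def unlock(self, passphrase):
        volume_key = _xor(self.slot, _kek(passphrase, self.salt))
        if hashlib.sha256(volume_key).digest() != self.digest:
            raise ValueError("wrong passphrase")
        return volume_key

    def change_passphrase(self, old, new):
        self._wrap(self.unlock(old), new)  # re-wraps; volume key untouched

    def clone(self):
        return copy.deepcopy(self)  # copies the header, hence the key


master = ToyContainer("default")
clone = master.clone()
clone.change_passphrase("default", "per-machine secret")

# Different passphrases, identical volume key:
print(master.unlock("default") == clone.unlock("per-machine secret"))  # True
```

A container created separately with ToyContainer("default") gets its own random volume key, which corresponds to the FAQ's note that a shared default passphrase is fine as long as each container was created individually.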

Comment 37 Laszlo Ersek 2022-07-12 08:45:48 UTC
And compare how the virt-sysprep manual starts:

https://libguestfs.org/virt-sysprep.1.html

> DESCRIPTION
> 
> Virt-sysprep can reset or unconfigure a virtual machine so that clones can be made from it. [...]

I think we do need a new BZ, but that new BZ should be about updating the virt-sysprep documentation. The DESCRIPTION section should point out early that disk images that are internally encrypted with LUKS (in any LVM scheme) are out of scope for virt-sysprep -- and add a link to the LUKS FAQ.

Theoretically, virt-builder could be extended with re-encryption (changing the master key), but it would be a huge amount of work... I'm not sure it's worth it. I think for now a new BZ suffices for updating the DESCRIPTION of virt-sysprep, with the support statement / warning.

And, again, I think the present BZ can go to VERIFIED.

Comment 38 YongkuiGuo 2022-07-12 09:02:41 UTC
Hi, lersek

Thanks for your quick response and for all the details. I will file a new bug to track this change, namely that virt-sysprep will no longer support any disk images encrypted with LUKS on LV. We cannot set this BZ to VERIFIED until verification also passes on the virt-v2v side.

Comment 39 Laszlo Ersek 2022-07-12 09:12:07 UTC
(In reply to YongkuiGuo from comment #38)
> We cannot set this BZ to VERIFIED until verification also passes on the
> virt-v2v side.

You are completely right of course; I forgot that the tests in comment 30 didn't include virt-v2v!

Comment 40 Richard W.M. Jones 2022-07-12 10:12:23 UTC
(In reply to Laszlo Ersek from comment #35)
> * never distribute encrypted guest disk images; never use them as "master"
> images that are to be cloned as actual virtual machine disk images.

An even more practical reason is that they are gigantic and impossible to compress.
The smallest Windows BitLocker disk I could make (for testing purposes) was
38M xz-compressed, and it has only a couple of files in it.

> The reason for this advice (according to the article) was that all those
> disk images (cloned from the same "master image") would use the same
> *internal* LUKS master key.

Indeed.  The passphrase slots only encrypt a master key.

> Even if you change the particular keys in the
> LUKS keyslots, those would not change the internal master key. Changing the
> internal master key actually requires decrypting and re-encrypting the whole
> disk, sector by sector.

All true.

> And distributing multiple images with identical (not
> randomly generated, unique) internal master keys is a big NO-NO
> cryptographically speaking.
> 
> Now, consider the purpose of virt-sysprep: it aims directly at scrubbing a
> VM disk image so that it can be used later as a VM master image.
> 
> In other words, virt-sysprep should not be used *AT ALL* on encrypted disks.
> With that, I'm tempted to qualify your virt-sysprep test, on *any* disk
> image that uses LUKS (LUKS-on-LV or LV-on-LUKS, does not matter) invalid.

Having said that, we probably shouldn't break on these disks.

Comment 41 Richard W.M. Jones 2022-07-12 10:15:00 UTC
(In reply to Laszlo Ersek from comment #37)
> Theoretically, virt-builder could be extended with re-encryption (changing
> the master key), but it would be a huge amount of work... I'm not sure it's
> worth it. I think for now a new BZ suffices for updating the DESCRIPTION of
> virt-sysprep, with the support statement / warning.

I think there's definitely a future feature for virt-builder where it
introduces qemu block layer LUKS encryption (so not LUKS inside the guest).
This is actually not very hard.

However, re-encrypting an existing encrypted template would indeed be a bad
idea for all the reasons you outlined already.

Comment 42 Laszlo Ersek 2022-07-12 10:18:36 UTC
(In reply to Richard W.M. Jones from comment #40)
> (In reply to Laszlo Ersek from comment #35)

> > In other words, virt-sysprep should not be used *AT ALL* on encrypted disks.
> > With that, I'm tempted to qualify your virt-sysprep test, on *any* disk
> > image that uses LUKS (LUKS-on-LV or LV-on-LUKS, does not matter) invalid.
> 
> Having said that, we probably shouldn't break on these disks.

Well, we can use some heuristics: expand the /dev/mapper/luks-* glob (because the inspection creates such device nodes), likely with guestfs_glob_expand_opts(), and then call guestfs_cryptsetup_close() on each, ignoring errors. This should be done between the "umount_all" call and the "perform_operations_on_devices" call. This should do the right thing when LUKS-on-LV is in use, and (justifiably) fail when LV-on-LUKS is in use (hence the ignoring of errors).

I guess this would not be 100% foolproof, but could accompany (as "best effort") the documentation update.

Comment 43 Vera 2022-07-20 09:49:10 UTC
Tried on the virt-v2v side, as in comment 28:
virt-v2v-2.0.7-2.el9.x86_64
libguestfs-1.48.4-1.el9.x86_64
guestfs-tools-1.48.2-5.el9.x86_64

Steps:
1. Create 2 VMs (rhel9-luks and rhel9-luks-on-lv) with LUKS-encrypted LVs
[root@localhost ~]# lsblk
NAME                                          MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sr0                                            11:0    1 1024M  0 rom   
vda                                           252:0    0   10G  0 disk  
├─vda1                                        252:1    0    1G  0 part  /boot
└─vda2                                        252:2    0    9G  0 part  
  └─luks-c589defb-6e58-4bcc-a09c-2bd455396fe9 253:0    0    9G  0 crypt 
    ├─rhel-swap                               253:1    0    1G  0 lvm   [SWAP]
    └─rhel-root                               253:2    0    8G  0 lvm   /

[root@localhost ~]# lsblk
NAME                                            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sr0                                              11:0    1 1024M  0 rom   
vda                                             252:0    0   10G  0 disk  
├─vda1                                          252:1    0    1G  0 part  /boot
└─vda2                                          252:2    0    9G  0 part  
  ├─rhel-root                                   253:0    0    8G  0 lvm   
  │ └─luks-e5b1101d-815d-4f07-a19e-8f739bcc3657 253:2    0    8G  0 crypt /
  └─rhel-swap                                   253:1    0    1G  0 lvm   
    └─luks-91fdf893-fff9-40b1-be55-ddb872501e72 253:3    0 1008M  0 crypt [SWAP]

2. Create a Tang VM (rhel9-tang).

3. Install the Clevis client on both rhel9-luks and rhel9-luks-on-lv, and configure the Tang server on the rhel9-tang VM, according to the doc.

4. Start the VMs and check that the configuration is correct: each VM boots to the login screen without prompting for a password.

5. Keep the Tang server running and shut down the other 2 VMs, then try to convert them via virt-v2v:

# virt-v2v -ic qemu:///system -o rhv-upload -of qcow2 -os nfs_data -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /v2v-ops/rhvpasswd -oo rhv-cafile=/v2v-ops/ca22.pem -oo rhv-cluster=NFS -oo rhv-direct --mac 52:54:00:93:5c:ec:network:ovirtmgmt rhel9-luks
[   0.0] Setting up the source: -ic qemu:///system -i libvirt rhel9-luks
[   1.1] Opening the source
Enter key or passphrase ("/dev/sda2"): 
[  13.1] Inspecting the source
[  16.0] Checking for sufficient free disk space in the guest
[  16.0] Converting Red Hat Enterprise Linux 9.1 Beta (Plow) to run on KVM


# virt-v2v -ic qemu:///system -o rhv-upload -of qcow2 -os nfs_data -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /v2v-ops/rhvpasswd -oo rhv-cafile=/v2v-ops/ca22.pem -oo rhv-cluster=NFS -oo rhv-direct --mac 52:54:00:93:5c:ec:network:ovirtmgmt rhel9-luks-on-lv
[   0.0] Setting up the source: -ic qemu:///system -i libvirt rhel9-luks-on-lv
[   1.1] Opening the source
Enter key or passphrase ("/dev/rhel/root"): 

virt-v2v still prompts for a passphrase during conversion.
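
The two lsblk listings in step 1 differ in where the "crypt" layer sits: directly on the partition in rhel9-luks, and on top of each LV in rhel9-luks-on-lv. As a hedged illustration, the sketch below walks the tree that `lsblk --json` prints and reports the parent of every "crypt" node. The function name `crypt_parents` is hypothetical, the JSON shape ("blockdevices", "children", "type" keys) is assumed, and the sample data is hand-trimmed from the listings above rather than captured output.

```python
def crypt_parents(lsblk_json):
    """Return (crypt_name, parent_name, parent_type) for every 'crypt' node."""

    def walk(node, parent):
        if node.get("type") == "crypt" and parent is not None:
            yield node["name"], parent["name"], parent.get("type")
        for child in node.get("children", []):
            yield from walk(child, node)

    found = []
    for dev in lsblk_json.get("blockdevices", []):
        found.extend(walk(dev, None))
    return found


# Hand-written sample, trimmed from the rhel9-luks listing (LVs on LUKS).
lv_on_luks = {"blockdevices": [{"name": "vda", "type": "disk", "children": [
    {"name": "vda1", "type": "part"},
    {"name": "vda2", "type": "part", "children": [
        {"name": "luks-c589defb-6e58-4bcc-a09c-2bd455396fe9", "type": "crypt",
         "children": [{"name": "rhel-swap", "type": "lvm"},
                      {"name": "rhel-root", "type": "lvm"}]}]}]}]}

# Hand-written sample, trimmed from the rhel9-luks-on-lv listing (LUKS on LVs).
luks_on_lv = {"blockdevices": [{"name": "vda", "type": "disk", "children": [
    {"name": "vda1", "type": "part"},
    {"name": "vda2", "type": "part", "children": [
        {"name": "rhel-root", "type": "lvm", "children": [
            {"name": "luks-e5b1101d-815d-4f07-a19e-8f739bcc3657", "type": "crypt"}]},
        {"name": "rhel-swap", "type": "lvm", "children": [
            {"name": "luks-91fdf893-fff9-40b1-be55-ddb872501e72", "type": "crypt"}]}]}]}]}

first = crypt_parents(lv_on_luks)
second = crypt_parents(luks_on_lv)
```

A parent of type "part" corresponds to the first layout and a parent of type "lvm" to the second. Note that device names seen inside the guest (vda2) need not match the names virt-v2v prompts for (/dev/sda2).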
===============================

Hi, rjones & Laszlo:
I think the expected result is that virt-v2v can convert the VMs without prompting for any passwords.
Is that correct?

Thanks,
Vera

Comment 44 Laszlo Ersek 2022-07-20 10:14:07 UTC
Hello Vera,

please pass the following option on the virt-v2v command line:

  --key /dev/sda2:clevis

Please refer to

  --key ID:clevis

at <https://libguestfs.org/virt-v2v.1.html>.

Comment 45 Laszlo Ersek 2022-07-20 10:15:27 UTC
(Also, "--key /dev/rhel/root:clevis".)

Comment 46 Vera 2022-07-22 05:56:53 UTC
(In reply to Laszlo Ersek from comment #45)
> (Also, "--key /dev/rhel/root:clevis".)

Thanks, Laszlo, for pointing that out.

Tried on:
virt-v2v-2.0.7-2.el9.x86_64
libguestfs-1.48.4-1.el9.x86_64
guestfs-tools-1.48.2-5.el9.x86_64

Steps:
1. Create 2 VMs (rhel9-luks and rhel9-luks-on-lv) with LUKS-encrypted LVs
[root@localhost ~]# lsblk
NAME                                          MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sr0                                            11:0    1 1024M  0 rom   
vda                                           252:0    0   10G  0 disk  
├─vda1                                        252:1    0    1G  0 part  /boot
└─vda2                                        252:2    0    9G  0 part  
  └─luks-c589defb-6e58-4bcc-a09c-2bd455396fe9 253:0    0    9G  0 crypt 
    ├─rhel-swap                               253:1    0    1G  0 lvm   [SWAP]
    └─rhel-root                               253:2    0    8G  0 lvm   /

[root@localhost ~]# lsblk
NAME                                            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sr0                                              11:0    1 1024M  0 rom   
vda                                             252:0    0   10G  0 disk  
├─vda1                                          252:1    0    1G  0 part  /boot
└─vda2                                          252:2    0    9G  0 part  
  ├─rhel-root                                   253:0    0    8G  0 lvm   
  │ └─luks-e5b1101d-815d-4f07-a19e-8f739bcc3657 253:2    0    8G  0 crypt /
  └─rhel-swap                                   253:1    0    1G  0 lvm   
    └─luks-91fdf893-fff9-40b1-be55-ddb872501e72 253:3    0 1008M  0 crypt [SWAP]

2. Create a Tang VM (rhel9-tang).

3. Install the Clevis client on both rhel9-luks and rhel9-luks-on-lv, and configure the Tang server on the rhel9-tang VM, according to the doc.

4. Start the VMs and check that the configuration is correct: each VM boots to the login screen without prompting for a password.

5. Keep the Tang server running and shut down the other 2 VMs, then try to convert them via virt-v2v.
# virsh list --all |grep rhel9
 1    rhel9-tang                                 running
 -    rhel9-luks                                 shut off
 -    rhel9-luks-on-lv                           shut off

# virt-v2v -ic qemu:///system -o rhv-upload -of qcow2 -os nfs_data -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /v2v-ops/rhvpasswd -oo rhv-cafile=/v2v-ops/ca22.pem -oo rhv-cluster=Default -oo rhv-direct --mac 52:54:00:93:5c:ec:network:ovirtmgmt rhel9-luks --key "/dev/sda2":clevis
[   0.0] Setting up the source: -ic qemu:///system -i libvirt rhel9-luks
[   1.1] Opening the source
[   8.9] Inspecting the source
[  11.7] Checking for sufficient free disk space in the guest
[  11.7] Converting Red Hat Enterprise Linux 9.1 Beta (Plow) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  68.0] Mapping filesystem data to avoid copying unused and blank areas
virt-v2v: warning: fstrim on guest filesystem /dev/rhel/root failed.  
Usually you can ignore this message.  To find out more read "Trimming" in 
virt-v2v(1).

Original message: fstrim: fstrim: /sysroot/: the discard operation is not 
supported
[  69.9] Closing the overlay
[  70.2] Assigning disks to buses
[  70.2] Checking if the guest needs BIOS or UEFI to boot
[  70.2] Setting up the destination: -o rhv-upload -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -os nfs_data
[ 111.0] Copying disk 1/1
█ 100% [****************************************]
[ 193.3] Creating output metadata
[ 207.6] Finishing off

# virt-v2v -ic qemu:///system -o rhv-upload -of qcow2 -os nfs_data -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /v2v-ops/rhvpasswd -oo rhv-cafile=/v2v-ops/ca22.pem -oo rhv-cluster=Default -oo rhv-direct --mac 52:54:00:93:5c:ec:network:ovirtmgmt rhel9-luks-on-lv --key "/dev/rhel/root":clevis --key "/dev/rhel/swap":clevis
[   0.0] Setting up the source: -ic qemu:///system -i libvirt rhel9-luks-on-lv
[   1.1] Opening the source
[  13.4] Inspecting the source
[  16.2] Checking for sufficient free disk space in the guest
[  16.2] Converting Red Hat Enterprise Linux 9.1 Beta (Plow) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  72.6] Mapping filesystem data to avoid copying unused and blank areas
virt-v2v: warning: fstrim on guest filesystem 
/dev/mapper/luks-e5b1101d-815d-4f07-a19e-8f739bcc3657 failed.  Usually you 
can ignore this message.  To find out more read "Trimming" in virt-v2v(1).

Original message: fstrim: fstrim: /sysroot/: the discard operation is not 
supported
[  73.6] Closing the overlay
[  73.9] Assigning disks to buses
[  73.9] Checking if the guest needs BIOS or UEFI to boot
[  73.9] Setting up the destination: -o rhv-upload -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -os nfs_data
[  86.7] Copying disk 1/1
█ 100% [****************************************]
[ 135.6] Creating output metadata
[ 151.9] Finishing off

6. Start the converted VMs; all checkpoints pass.
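
For reference, the `--key DEVICE:clevis` arguments used in step 5 can be generated per device instead of typed by hand. This is only a sketch: the helper name `clevis_key_args` is hypothetical, and the command is assembled and printed, not executed. The output-side options (`-o rhv-upload` and friends) are left out for brevity.

```python
import shlex


def clevis_key_args(devices):
    """Return ['--key', 'DEV:clevis', ...], one pair per encrypted device."""
    args = []
    for dev in devices:
        args += ["--key", f"{dev}:clevis"]
    return args


# Device names as virt-v2v reported them in its passphrase prompts.
argv = ["virt-v2v", "-ic", "qemu:///system", "rhel9-luks-on-lv"]
argv += clevis_key_args(["/dev/rhel/root", "/dev/rhel/swap"])
command_line = shlex.join(argv)
print(command_line)
```

Building the argument list first and joining it with `shlex.join()` avoids quoting mistakes if a device path ever contains unusual characters.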

Comment 47 YongkuiGuo 2022-07-25 02:35:17 UTC
Verified this bug per comments 30, 32, and 46.

Comment 49 errata-xmlrpc 2022-11-15 09:52:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Low: libguestfs security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7958

