Bug 2130192
Summary: | RFE: support live migrating TPM state to a target that shares storage with the source | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Jed Lejosne <jlejosne> | |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> | |
libvirt sub component: | General | QA Contact: | Yanqiu Zhang <yanqzhan> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | unspecified | |||
Priority: | unspecified | CC: | coli, danken, dzheng, jdenemar, jsuchane, lcheng, lmen, mprivozn, mtessun, qcheng, virt-maint, xuzhang, yanqzhan, ymankad | |
Version: | 9.0 | Keywords: | FutureFeature, Triaged, Upstream | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-9.0.0-1.el9 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2152977 | Environment: | |
Last Closed: | 2023-05-09 07:27:15 UTC | Type: | Feature Request | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | 9.0.0 | |
Embargoed: | ||||
Bug Depends On: | 2092944, 2152977 | |||
Bug Blocks: |
Description
Jed Lejosne
2022-09-27 13:14:14 UTC
According to https://listman.redhat.com/archives/libvir-list/2022-October/234970.html swtpm 0.8 will be required, so I added a dependency on the swtpm rebase bug 2092944 and set ITR=9.2.0, on the assumption that Stefan will soon merge the changes into libvirt and release swtpm-0.8, which Marc-Andre can then rebase downstream. I'm not setting a DTM yet because I'm not sure which libvirt release the changes will land in.

v4 landed on the list: https://listman.redhat.com/archives/libvir-list/2022-October/235089.html

Merged upstream as:

    3c9968ec9a qemu: tpm: Never remove state on outgoing migration and shared storage
    2e669ec789 qemu: tpm: Avoid security labels on incoming migration with shared storage
    188dfeb398 qemu: tpm: Pass --migration option to swtpm if supported and needed
    5597476e40 qemu: tpm: Add support for storing private TPM-related data
    68103e9daf qemu: tpm: Conditionally create storage on incoming migration
    384138d790 qemu: tpm: Introduce qemuTPMHasSharedStorage()
    1537c73da2 util: Add parsing support for swtpm's cmdarg-migration capability

    v8.9.0-210-g3c9968ec9a

Tested with NFS and hit an issue: migrating back fails.

    # rpm -q libvirt qemu-kvm swtpm
    libvirt-8.10.0-1.fc37.x86_64(v8.9.0-259-gc9a65eb8a6 )
    qemu-kvm-7.1.0-3.fc38.x86_64
    swtpm-0.8.0-2.fc38.x86_64

The NFS:

    [root@dell-per*** ~]# cat /etc/exports
    /test/images *(rw,async,no_root_squash)
    /test/swtpm *(rw,async,no_root_squash)

On both hosts:

    # df -h
    dell-per***:/test/images 70G 22G 49G 31% /nfs-images
    dell-per***:/test/swtpm 70G 22G 49G 31% /var/lib/libvirt/swtpm
    # getenforce
    Enforcing

On hostA:

    # virsh define avocado-vt-vm1.xml
    Domain 'avocado-vt-vm1' defined from avocado-vt-vm1.xml
    # virsh start avocado-vt-vm1
    Domain 'avocado-vt-vm1' started

-rw-------. 
1 tss tss system_u:object_r:nfs_t:s0 6.0K Nov 21 02:58 tpm2-00.permall

    # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
    Migration: [100 %]
    [hostB]# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose
    error: operation failed: swtpm died and reported:
    2022-11-21 09:26:16.575+0000: 69914: debug : virExec:874 : Setting child security label to system_u:system_r:svirt_t:s0:c673,c797
    2022-11-21 09:26:16.575+0000: 69914: debug : virExecCommon:462 : Setting child uid:gid to 59:59 with caps 0

If SELinux is set to permissive on both hosts, migrating back succeeds:

    # setenforce 0
    [hostB]# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose
    Migration: [100 %]

Tested with ceph: the migration never finishes and there is an error in the log:

    # df -hT
    10.0.*.*:6789:/swtpm ceph 51G 3.3G 48G 7% /var/lib/libvirt/swtpm
    # ll -hZd /var/lib/libvirt/swtpm
    drwxr-xr-x. 2 root root unconfined_u:object_r:unlabeled_t:s0 0 Nov 22 03:38 /var/lib/libvirt/swtpm
    # virsh define avocado-vt-vm1.xml
    Domain 'avocado-vt-vm1' defined from avocado-vt-vm1.xml
    # virsh start avocado-vt-vm1
    Domain 'avocado-vt-vm1' started
    # ll -hZd /var/lib/libvirt/swtpm
    drwxr-xr-x. 3 root root unconfined_u:object_r:unlabeled_t:s0 1 Nov 22 09:05 /var/lib/libvirt/swtpm
    # ll -hZR /var/lib/libvirt/swtpm
    drwx--x--x. 3 root root system_u:object_r:unlabeled_t:s0 1 Nov 22 09:05 d2da0f21-40f6-46f2-869d-02d76faab3a7
    …
    drwx------. 2 tss tss system_u:object_r:svirt_image_t:s0:c782,c968 2 Nov 22 09:06 tpm2
    …
    -rw-------. 1 tss tss system_u:object_r:svirt_image_t:s0:c782,c968 6.0K Nov 22 09:06 tpm2-00.permall
    # virsh dumpxml avocado-vt-vm1 |grep /tpm -B3
    <tpm model='tpm-crb'>
      <backend type='emulator' version='2.0'/>
      <alias name='tpm0'/>
    </tpm>
    # ps aux|grep tpm
    …

tss 79756 0.0 0.0 10756 6996 ? 
S 09:06 0:00 /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing qemu 79766 30.7 0.9 3716896 630812 ? Sl 09:06 0:35 /usr/bin/qemu-system-x86_64 -name guest=avocado-vt-vm1,...-chardev socket,id=chrtpm,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock -tpmdev emulator,id=tpm-tpm0,chardev=chrtpm -device {"driver":"tpm-crb","tpmdev":"tpm-tpm0","id":"tpm0"} # virsh console avocado-vt-vm1 Connected to domain 'avocado-vt-vm1' … [root@localhost ~]# tpm2_getrandom --hex 16 1dc19064b1b5d687e15aefe8a7e6ef5f[root@localhost ~]# # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose (it costs long time never stop, check target virtqemud.log there's error in it, then manually cancelling it can see error reported) ^C^C^C^Cerror: internal error: qemu unexpectedly closed the monitor: 2022-11-22T14:10:19.253647Z qemu-system-x86_64: tpm-emulator: TPM result for CMD_INIT: 0x101 operation failed hostB:# cat /var/log/libvirt/virtqemud.log ... 2022-11-22 14:10:18.937+0000: 102944: debug : virCommandRunAsync:2604 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing,incoming ... 2022-11-22 14:10:19.307+0000: 103082: error : qemuMonitorIORead:423 : Unable to read from monitor: Connection reset by peer ... 
2022-11-22 14:10:19.308+0000: 103082: error : qemuProcessReportLogError:1959 : internal error: qemu unexpectedly closed the monitor: 2022-11-22T14:10:19.253647Z qemu-system-x86_64: tpm-emulator: TPM result for CMD_INIT: 0x101 operation failed

hostB:

    # cat /var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log
    SWTPM_NVRAM_LoadData: Error (fatal) opening /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/tpm2-00.permall for read, Permission denied
    SWTPM_NVRAM_LoadData: Error (fatal) opening /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/tpm2-00.permall for read, Permission denied
    libtpms/tpm2: Entering failure mode; code: 8, location: NvPowerOn line 126
    Error: Could not initialize libtpms.
    Error: Could not initialize the TPM
    Data client disconnected
    SWTPM_NVRAM_Lock_Dir: Could not open lockfile: Permission denied

BTW: the same guest can be started on both the source and the target host.

Hi Michal,
Could you please check my test results in comment 7 and comment 9? Do the two issues need further fixes, or is something wrong with my test steps? Thank you!

Supplement to comment 9: after # setenforce 0, the migration succeeds:

    # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
    Migration: [100 %]

hostB:

    drwx--x--x. 3 root root system_u:object_r:unlabeled_t:s0 1 Nov 22 22:43 d2da0f21-40f6-46f2-869d-02d76faab3a7
    drwx------. 2 tss tss system_u:object_r:svirt_image_t:s0:c387,c763 2 Nov 22 22:50 tpm2

-rw-------. 
1 tss tss system_u:object_r:svirt_image_t:s0:c387,c763 9.1K Nov 22 22:50 tpm2-00.permall

Target logs:

1) virtqemud.log:

    2022-11-23 03:50:09.149+0000: 107076: debug : virCommandRunAsync:2604 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing,incoming

(no errors)

2) avocado-vt-vm1-swtpm.log: (empty)

SELinux is indeed a big concern here... Anything that uses SELinux MCS will require forcing the level of the files to "s0", i.e. no categories, or the files won't be accessible by another VM.

Yeah, sorry for the late reply, but I was trying to debug this and come up with a solution. No luck so far, but I've started a discussion with the author of the patches here:

https://listman.redhat.com/archives/libvir-list/2022-November/235977.html

(In reply to Michal Privoznik from comment #14)
> Yeah, sorry for late reply, but I was trying to debug this and come up with
> a solution. No luck so far, but I've started a discussion with the author of
> the patches here:
>
> https://listman.redhat.com/archives/libvir-list/2022-November/235977.html

That's okay. Looking forward to the conclusions. If this bug needs to be set back to 'assigned', please let me know. Thanks.

Yeah, let's move it back to ASSIGNED.

Alright, I've posted fixes here:

https://listman.redhat.com/archives/libvir-list/2022-December/236067.html

Merged upstream as:

    713578d77f qemu_tpm: Set log file label on migration
    3c2e55c5ed qemu_tpm: Extend start/stop APIs
    f3259f82fd security: Extend TPM label APIs

    v8.10.0-54-g713578d77f

I believe this will fix the problems from both comment 7 and comment 9, because the root cause of both is SELinux permissions.

Test with NFS passes.
libvirt-9.0.0-1.fc37.x86_64(v8.10.0-85-g8a5a4e6dbd) swtpm-0.8.0-2.fc38.x86_64 qemu-kvm-7.1.0-3.fc38.x86_64 libtpms-0.9.5-2.fc37.x86_64 dell-per***:/test/swtpm 70G 22G 49G 31% /var/lib/libvirt/swtpm # getenforce Enforcing Steps: 1.Print swtpm capabilities # /usr/bin/swtpm socket --print-capabilities|grep cmdarg-migration { "type": "swtpm", "features": [ "tpm-1.2", "tpm-2.0", "tpm-send-command-header", "flags-opt-startup", "flags-opt-disable-auto-shutdown", "ctrl-opt-terminate", "cmdarg-seccomp", "cmdarg-key-fd", "cmdarg-pwd-fd", "cmdarg-print-states", "cmdarg-chroot", "cmdarg-migration", "nvram-backend-dir", "nvram-backend-file" ], "version": "0.8.0" } 2.Define vm, start vm # virsh define avocado-vt-vm1.xml Domain 'avocado-vt-vm1' defined from avocado-vt-vm1.xml # virsh start avocado-vt-vm1 Domain 'avocado-vt-vm1' started <tpm model='tpm-crb'> <backend type='emulator' version='2.0'/> <alias name='tpm0'/> </tpm> drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 50 Dec 9 02:57 /var/lib/libvirt/swtpm drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 02:57 d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 02:57 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 6.0K Dec 9 02:57 tpm2-00.permall # ps aux|grep swtpm /libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing qemu 206991 68.1 0.3 3492932 213872 ? Sl 02:57 0:20 /usr/bin/qemu-system-x86_64 -name guest=avocado-vt-vm1,...-chardev socket,id=chrtpm,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock -tpmdev emulator,id=tpm-tpm0,chardev=chrtpm -device {"driver":"tpm-crb","tpmdev":"tpm-tpm0","id":"tpm0"} IN guest os check vtpm funtion: [root@localhost ~]# tpm2_getrandom --hex 16 54c94f243ca36ddfc9c07965abe6f90e 3. 
Migrate to/back # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose Migration: [100 %] On hostB: drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 50 Dec 9 02:57 /var/lib/libvirt/swtpm drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **02:57** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 03:00 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 9.1K Dec 9 03:00 tpm2-00.permall Check swtpm cmds in logs: (1)target swtpm.log is empty(means no swtpm_setup executed) (2)virtqemud.log on source: # grep 'About to run /usr/bin/swtpm' /var/log/libvirt/virtqemud.log 2022-12-09 07:56:14.886+0000: 206801: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm_setup --print-capabilities 2022-12-09 07:57:10.659+0000: 206802: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm_setup --tpm2 --tpm-state /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2 --vmid avocado-vt-vm1:d2da0f21-40f6-46f2-869d-02d76faab3a7 --logfile /var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --createek --create-ek-cert --create-platform-cert --lock-nvram --not-overwrite 2022-12-09 07:57:10.837+0000: 206802: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --print-capabilities 2022-12-09 07:57:10.861+0000: 206802: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing (3)virtqemud.log on target:(on swtpm_setup, only swtpm socket) # grep 'About to run /usr/bin/swtpm' /var/log/libvirt/virtqemud.log 2022-12-09 08:00:11.901+0000: 212343: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm_setup --print-capabilities 2022-12-09 08:00:12.374+0000: 212343: debug : 
virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --print-capabilities 2022-12-09 08:00:12.390+0000: 212343: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing,incoming Login vm on target host, vtpm still works well. Migrate back: [hostB]# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose Migration: [100 %] On hostA: drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **02:57** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 03:10 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 9.1K Dec 9 03:10 tpm2-00.permall Login vm, vtpm still works well. 4. Migrate to/back with –undefine-source –persistent –p2p # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose --p2p --undefinesource --persistent Migration: [100 %] On hostB: drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **02:57** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 03:18 tpm2 -rw-------. 
1 tss tss system_u:object_r:nfs_t:s0 9.1K Dec 9 03:18 tpm2-00.permall 2022-12-09 08:18:08.408+0000: 212858: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm_setup --print-capabilities 2022-12-09 08:18:08.882+0000: 212858: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --print-capabilities 2022-12-09 08:18:08.896+0000: 212858: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing,incoming [hostB]# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose --p2p --undefinesource --persistent Migration: [100 %] On hostA: drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **02:57** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 03:28 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 9.1K Dec 9 03:28 tpm2-00.permall 5. Migrate to/back a transient vm: # virsh create avocado-vt-vm1.xml Domain 'avocado-vt-vm1' created from avocado-vt-vm1.xml # virsh list --all --transient Id Name State -------------------------------- 1 avocado-vt-vm1 running # ps aux|grep swtpm tss 208361 0.2 0.0 10756 7120 ? S 03:52 0:00 /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing qemu 208369 84.8 0.9 3643196 643416 ? 
Sl 03:52 0:33 /usr/bin/qemu-system-x86_64 -name guest=avocado-vt-vm1,...-chardev socket,id=chrtpm,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock -tpmdev emulator,id=tpm-tpm0,chardev=chrtpm -device {"driver":"tpm-crb","tpmdev":"tpm-tpm0","id":"tpm0"} drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 50 Dec 9 03:52 /var/lib/libvirt/swtpm drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **03:52** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 03:52 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 6.0K Dec 9 03:52 tpm2-00.permall # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose --p2p Migration: [100 %] On hostB: (1)Login vm , vtpm still works well. (2)drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **03:52** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 04:00 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 9.1K Dec 9 04:00 tpm2-00.permall (3)2022-12-09 09:00:11.786+0000: 213339: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm_setup --print-capabilities 2022-12-09 09:00:12.246+0000: 213339: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --print-capabilities 2022-12-09 09:00:12.261+0000: 213339: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --migration release-lock-outgoing,incoming [hostB]# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose --p2p Migration: [100 %] On hostA: Login vm , vtpm still works well. drwx--x--x. 3 root root system_u:object_r:nfs_t:s0 18 Dec 9 **03:52** d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 
2 tss tss system_u:object_r:nfs_t:s0 42 Dec 9 04:07 tpm2 -rw-------. 1 tss tss system_u:object_r:nfs_t:s0 9.1K Dec 9 04:07 tpm2-00.permall 6.Migrate with lower version swtpm (1)Downgrade source swtpm to swtpm-0.7.3-2.20220427gitf2268ee.fc37.x86_64: # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose error: this function is not supported by the connection driver: the running swtpm does not support migration with shared storage (2)Downgrade target swtpm to v0.7.3-2: # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose error: argument unsupported: /usr/bin/swtpm (on destination side) does not support the --migration option needed for migration with shared storage Test on ceph, the same fail with comment9 still exists. Please help check. Thanks! libvirt-9.0.0-1.fc37.x86_64(v8.10.0-85-g8a5a4e6dbd) swtpm-0.8.0-2.fc38.x86_64 qemu-kvm-7.1.0-3.fc38.x86_64 libtpms-0.9.5-2.fc37.x86_64 10.0.*.*:6789:/swtpm 51G 3.3G 48G 7% /var/lib/libvirt/swtpm # virsh define avocado-vt-vm1.xml Domain 'avocado-vt-vm1' defined from avocado-vt-vm1.xml # virsh start avocado-vt-vm1 Domain 'avocado-vt-vm1' started # virsh dumpxml avocado-vt-vm1 |grep /tpm -B8 <tpm model='tpm-crb'> <backend type='emulator' version='2.0'> <encryption secret='0051c505-1ad0-4d77-9b3e-360c8f5e3b86'/> <active_pcr_banks> <sha256/> </active_pcr_banks> </backend> <alias name='tpm0'/> </tpm> drwxr-xr-x. 3 root root unconfined_u:object_r:unlabeled_t:s0 1 Dec 12 04:46 /var/lib/libvirt/swtpm drwx--x--x. 3 root root system_u:object_r:unlabeled_t:s0 1 Dec 12 04:46 d2da0f21-40f6-46f2-869d-02d76faab3a7 drwx------. 2 tss tss system_u:object_r:svirt_image_t:s0:c770,c950 2 Dec 12 04:46 tpm2 -rw-------. 
1 tss tss system_u:object_r:svirt_image_t:s0:c770,c950 6.1K Dec 12 04:46 tpm2-00.permall wait for vm os fully boots up, # virsh managedsave avocado-vt-vm1 Domain 'avocado-vt-vm1' state saved by libvirt # virsh start avocado-vt-vm1 error: Failed to start domain 'avocado-vt-vm1' error: internal error: qemu unexpectedly closed the monitor: 2022-12-12T10:00:30.135840Z qemu-system-x86_64: tpm-emulator: TPM result for CMD_INIT: 0x101 operation failed (later try to make it back to work: # virsh managedsave-remove avocado-vt-vm1 Removed managedsave image for domain 'avocado-vt-vm1' # virsh start avocado-vt-vm1 error: Failed to start domain 'avocado-vt-vm1' error: Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/.lock which is already in use # rm -f /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/.lock # virsh start avocado-vt-vm1 Domain 'avocado-vt-vm1' started) Errors in swtpm log: Ending vTPM manufacturing @ Mon 12 Dec 2022 04:59:17 AM EST SWTPM_NVRAM_LoadData: Error (fatal) opening /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/tpm2-00.permall for read, Permission denied SWTPM_NVRAM_LoadData: Error (fatal) opening /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/tpm2-00.permall for read, Permission denied libtpms/tpm2: Entering failure mode; code: 8, location: NvPowerOn line 126 Error: Could not initialize libtpms. Error: Could not initialize the TPM Data client disconnected SWTPM_NVRAM_Lock_Dir: Could not open lockfile: Permission denied The TPM's state will be encrypted using a key derived from a passphrase (fd). 
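The "Permission denied" errors in the swtpm log above are consistent with the SELinux MCS concern raised earlier in this report: a state file that carries categories is only reachable by a process whose label includes those categories. The following is a minimal sketch of that check, assuming the usual MCS rule that the process's category set must contain every category on the file. The helper names `mcs_categories` and `mcs_allows` are hypothetical, and pairing these particular labels (both copied from listings in this report) is for illustration only. Type enforcement is ignored entirely.

```python
from __future__ import annotations


def mcs_categories(context: str) -> frozenset[int]:
    """Return the MCS category numbers of an SELinux context string.

    Hypothetical helper: handles "user:role:type:s0", "...:s0:c1,c2" and
    category ranges such as "c0.c1023".
    """
    parts = context.split(":", 3)
    if len(parts) < 4:
        raise ValueError(f"not a full SELinux context: {context!r}")
    # For a level range such as "s0-s0:c0.c1023", use the upper level.
    level = parts[3].split("-")[-1]
    _sensitivity, _, category_spec = level.partition(":")
    categories: set[int] = set()
    for item in filter(None, category_spec.split(",")):
        if "." in item:  # "c0.c1023" denotes an inclusive range
            low, high = item.split(".")
            categories.update(range(int(low[1:]), int(high[1:]) + 1))
        else:
            categories.add(int(item[1:]))
    return frozenset(categories)


def mcs_allows(process_context: str, file_context: str) -> bool:
    """Assumed MCS rule: the file's categories must be a subset of the process's."""
    return mcs_categories(file_context) <= mcs_categories(process_context)


if __name__ == "__main__":
    # A dynamically generated process label next to a state file that was
    # labelled with a different dynamic category pair.
    print(mcs_allows("system_u:system_r:svirt_t:s0:c673,c797",
                     "system_u:object_r:svirt_image_t:s0:c770,c950"))
    # A static label shared by process and state file.
    print(mcs_allows("system_u:system_r:svirt_t:s0:c392,c662",
                     "system_u:object_r:svirt_image_t:s0:c392,c662"))
```

Under this assumption, a file at plain "s0" (no categories) passes the MCS check for any process, and a file labelled with one category pair fails for a process that was given a different pair.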
Swtpm cmds in virtqemud.log when managedsave and start: 2022-12-12 09:59:24.480+0000: 221543: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --key pwdfd=27,mode=aes-256-cbc --migration-key pwdfd=29,mode=aes-256-cbc --migration release-lock-outgoing 2022-12-12 10:00:29.814+0000: 221541: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/2-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --key pwdfd=28,mode=aes-256-cbc --migration-key pwdfd=30,mode=aes-256-cbc --migration release-lock-outgoing,incoming (In reply to yanqzhan from comment #20) > Test on ceph, the same fail with comment9 still exists. Please help check. > Thanks! > libvirt-9.0.0-1.fc37.x86_64(v8.10.0-85-g8a5a4e6dbd) > swtpm-0.8.0-2.fc38.x86_64 > qemu-kvm-7.1.0-3.fc38.x86_64 > libtpms-0.9.5-2.fc37.x86_64 > > 10.0.*.*:6789:/swtpm 51G 3.3G 48G 7% > /var/lib/libvirt/swtpm > > # virsh define avocado-vt-vm1.xml > Domain 'avocado-vt-vm1' defined from avocado-vt-vm1.xml > # virsh start avocado-vt-vm1 > Domain 'avocado-vt-vm1' started > # virsh dumpxml avocado-vt-vm1 |grep /tpm -B8 > <tpm model='tpm-crb'> > <backend type='emulator' version='2.0'> > <encryption secret='0051c505-1ad0-4d77-9b3e-360c8f5e3b86'/> > <active_pcr_banks> > <sha256/> > </active_pcr_banks> > </backend> > <alias name='tpm0'/> > </tpm> > > drwxr-xr-x. 3 root root unconfined_u:object_r:unlabeled_t:s0 1 Dec 12 04:46 > /var/lib/libvirt/swtpm This looks very suspicious. > drwx--x--x. 
3 root root system_u:object_r:unlabeled_t:s0 1 Dec 12 04:46
> d2da0f21-40f6-46f2-869d-02d76faab3a7
> drwx------. 2 tss tss system_u:object_r:svirt_image_t:s0:c770,c950 2 Dec 12
> 04:46 tpm2
> -rw-------. 1 tss tss system_u:object_r:svirt_image_t:s0:c770,c950 6.1K Dec
> 12 04:46 tpm2-00.permall

But then these look good.

>
> wait for vm os fully boots up,
> # virsh managedsave avocado-vt-vm1
>
> Domain 'avocado-vt-vm1' state saved by libvirt
>
> # virsh start avocado-vt-vm1
> error: Failed to start domain 'avocado-vt-vm1'
> error: internal error: qemu unexpectedly closed the monitor:
> 2022-12-12T10:00:30.135840Z qemu-system-x86_64: tpm-emulator: TPM result for
> CMD_INIT: 0x101 operation failed

Alright, let me see whether I can reproduce this. I don't have a ceph node set up, so I'll try NFS. Hopefully, I'll be able to reproduce it. But CNV doesn't do managedsave, so we could track this in a different bug if needed. Let me get back to you with my analysis.

(In reply to Michal Privoznik from comment #21)
>
> Alright, let me see whether I can reproduce. I don't have a ceph node set
> up, so I'll try NFS. Hopefully, I'll be able to reproduce. But CNV doesn't
> do managedsave, so we could track this in a different bug if needed. Let me
> get back with my analysis.

Oh sorry, it is the same error as in comment 9, so I thought they were equivalent.
I can also reproduce it with a migration:

    # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
    error: internal error: qemu unexpectedly closed the monitor: 2022-12-12T10:41:54.606669Z qemu-system-x86_64: tpm-emulator: TPM result for CMD_INIT: 0x101 operation failed

Target swtpm log:

    SWTPM_NVRAM_LoadData: Error (fatal) opening /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/tpm2-00.permall for read, Permission denied
    SWTPM_NVRAM_LoadData: Error (fatal) opening /var/lib/libvirt/swtpm/d2da0f21-40f6-46f2-869d-02d76faab3a7/tpm2/tpm2-00.permall for read, Permission denied
    libtpms/tpm2: Entering failure mode; code: 8, location: NvPowerOn line 126
    Error: Could not initialize libtpms.
    Error: Could not initialize the TPM
    Data client disconnected
    SWTPM_NVRAM_Lock_Dir: Could not open lockfile: Permission denied

Firstly, huge thanks to Yanqiu, who let me use their ceph. It allowed me to find more issues with permissions. As for the issues themselves:

1) I've been testing the RHEL-9.2.0 beta (qemu-kvm-7.1.0-6.el9) and found that QEMU is missing a couple of fixes that were merged in the 7.2.0 time frame. Specifically, https://gitlab.com/qemu-project/qemu/-/commit/a0bcec03761477371ff7c2e80dc07fff14222d92 is a very important fix, and it's surrounded by similar patches. I'll be cloning this BZ so that QEMU can pick them up.

2) I was testing my patches against NFS, where the 'virt_use_nfs' sebool allowed some operations that are not allowed on ceph. Specifically, NFS does not propagate SELinux labels across all mount points; they are kept local. This is in contrast with ceph, which does propagate SELinux labels. My patches set up the label for the SWTPM logfile (/var/log/swtpm/libvirt/qemu/...) but leave the state file (/var/lib/libvirt/swtpm/....) alone, assuming the correct label was set on the source. Well, this works on NFS because the aforementioned sebool basically disabled SELinux checks for the state file.
But with ceph, we need to set the label on the destination just before QEMU tries to access the state (well, just before it tells swtpm to initialize itself).

Long story short, more work is needed. Therefore, I'm moving this back to ASSIGNED.

Alright, after some discussion upstream, there's not much we can do here. Shared storage requires users to provide static seclabels because libvirt cannot guarantee that a dynamic label is unique across multiple hosts. I've merged a couple of fixes though (around how seclabels are set, but more importantly - documentation of this fact). This is no different from disks, btw.

    a677ea928a docs: Recommend static seclabels for migration on shared storage
    10f9cb7705 qemu_security: Drop qemuSecurityStartTPMEmulator()
    3d2dfec95b qemu_tpm: Open code qemuSecurityStartTPMEmulator()
    c0c52a9519 qemu_tpm: Restore TPM labels on failed start
    bdbb8e7b00 qemu_security: Introduce qemuSecuritySetTPMLabels()
    51b92836ff qemu_security: Rename qemuSecurityCleanupTPMEmulator()
    8d6e1f3764 qemu_security: Rework qemuSecurityCleanupTPMEmulator()

    v8.10.0-160-ga677ea928a

Test on ceph failed with a new error:

    libvirt-9.0.0-1.fc38.x86_64(v9.0.0-rc1-5-ga5738ab74c)
    swtpm-0.8.0-2.fc38.x86_64
    qemu-kvm-7.2.0-1.fc38.x86_64
    libtpms-0.9.5-2.fc37.x86_64

    10.73.*.*:6789:/ ceph 240G 6.9G 233G 3% /var/lib/libvirt/swtpm

    # virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
    Migration: [ 95 %]error: operation failed: job 'migration out' unexpectedly failed

There are errors in the target virtqemud.log:

    2023-01-11 11:47:13.939+0000: 2328: error : qemuProcessReportLogError:1971 : internal error: qemu unexpectedly closed the monitor: 2023-01-11T11:47:13.766393Z qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f
    2023-01-11T11:47:13.766544Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator'
    2023-01-11T11:47:13.771537Z qemu-system-x86_64: load of migration failed: Input/output error
2023-01-11 11:47:13.939+0000: 2328: debug : qemuMonitorIO:540 : Error on monitor internal error: qemu unexpectedly closed the monitor: 2023-01-11T11:47:13.766393Z qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f
2023-01-11T11:47:13.766544Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator'
2023-01-11T11:47:13.771537Z qemu-system-x86_64: load of migration failed: Input/output error mon=0x7f0e3c0312f0 vm=0x7f0e3c2b9030 name=avocado-vt-vm1
2023-01-11 11:47:14.144+0000: 2343: debug : qemuMigrationDstErrorSave:7091 : Saving incoming migration error for domain avocado-vt-vm1: internal error: qemu unexpectedly closed the monitor: 2023-01-11T11:47:13.766393Z qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f
2023-01-11T11:47:13.766544Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator'
2023-01-11T11:47:13.771537Z qemu-system-x86_64: load of migration failed: Input/output error

Test for ceph passed after adding a static seclabel for the vm:

libvirt-9.0.0-1.fc38.x86_64 (v9.0.0-rc1-5-ga5738ab74c)
swtpm-0.8.0-2.fc38.x86_64
qemu-kvm-7.2.0-1.fc38.x86_64
libtpms-0.9.5-2.fc37.x86_64

10.73.*.*:6789:/ ceph 240G 11G 230G 5% /var/lib/libvirt/swtpm

# virsh define avocado-vtpm.xml
Domain 'avocado-vt-vm1' defined from avocado-vtpm.xml

# virsh dumpxml avocado-vt-vm1 |grep -E '/tpm|/seclabel' -B7
<tpm model='tpm-crb'>
  <backend type='emulator' version='2.0'>
    <encryption secret='64d3f77b-45d6-4c8a-90e9-53b76b9acbab'/>
    <active_pcr_banks>
      <sha256/>
    </active_pcr_banks>
  </backend>
</tpm>
...
<seclabel type='static' model='selinux' relabel='yes'>
  <label>system_u:system_r:svirt_t:s0:c392,c662</label>
</seclabel>

drwx--x--x. 3 root root system_u:object_r:virt_var_lib_t:s0 2 Jan 11 22:43 /var/lib/libvirt/swtpm

# virsh start avocado-vt-vm1
Domain 'avocado-vt-vm1' started

# virsh dumpxml avocado-vt-vm1 |grep -E '/tpm|/seclabel' -B8
<tpm model='tpm-crb'>
  <backend type='emulator' version='2.0'>
    <encryption secret='64d3f77b-45d6-4c8a-90e9-53b76b9acbab'/>
    <active_pcr_banks>
      <sha256/>
    </active_pcr_banks>
  </backend>
  <alias name='tpm0'/>
</tpm>
...
<seclabel type='static' model='selinux' relabel='yes'>
  <label>system_u:system_r:svirt_t:s0:c392,c662</label>
  <imagelabel>system_u:object_r:svirt_image_t:s0:c392,c662</imagelabel>
</seclabel>

drwx--x--x. 4 root root system_u:object_r:virt_var_lib_t:s0 3 Jan 11 22:45 /var/lib/libvirt/swtpm
drwx--x--x. 3 root root system_u:object_r:virt_var_lib_t:s0 1 Jan 11 22:45 f4f8010a-b09d-4150-9f4d-8909b0b74091
drwx------. 2 tss tss system_u:object_r:svirt_image_t:s0:c392,c662 2 Jan 11 22:46 tpm2
-rw-------. 1 tss tss system_u:object_r:svirt_image_t:s0:c392,c662 6.1K Jan 11 22:46 tpm2-00.permall

# virsh console avocado-vt-vm1
[root@localhost ~]# tpm2_getrandom --hex 8
8feed32466b1579f

# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
Migration: [100 %]

On target hostB:

(1)# virsh list
 Id   Name             State
--------------------------------
 1    avocado-vt-vm1   running

(2)# grep 'About to run /usr/bin/swtpm' /var/log/libvirt/virtqemud.log
2023-01-12 03:47:15.787+0000: 7501: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm_setup --print-capabilities
2023-01-12 03:47:17.152+0000: 7501: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --print-capabilities
2023-01-12 03:47:17.355+0000: 7501: debug : virCommandRunAsync:2607 : About to run /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/f4f8010a-b09d-4150-9f4d-8909b0b74091/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/avocado-vt-vm1-swtpm.log --terminate --tpm2 --key pwdfd=27,mode=aes-256-cbc --migration-key pwdfd=24,mode=aes-256-cbc --migration release-lock-outgoing,incoming

(3)# virsh console avocado-vt-vm1
[root@localhost ~]# tpm2_getrandom --hex 8
95f0e8fed5353d61

(4)# virsh dumpxml avocado-vt-vm1 |grep -E '/tpm|/seclabel' -B8
<tpm model='tpm-crb'>
  <backend type='emulator' version='2.0'>
    <encryption secret='64d3f77b-45d6-4c8a-90e9-53b76b9acbab'/>
    <active_pcr_banks>
      <sha256/>
    </active_pcr_banks>
  </backend>
  <alias name='tpm0'/>
</tpm>
...
<seclabel type='static' model='selinux' relabel='yes'>
  <label>system_u:system_r:svirt_t:s0:c392,c662</label>
  <imagelabel>system_u:object_r:svirt_image_t:s0:c392,c662</imagelabel>
</seclabel>

(5)# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose
Migration: [100 %]

2. migrate with --undefinesource pass:

# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose --p2p --undefinesource --persistent
Migration: [100 %]
hostB:# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose --p2p --undefinesource --persistent
Migration: [100 %]

3. migrate transient vm pass:

# virsh create avocado-vtpm.xml
Domain 'avocado-vt-vm1' created from avocado-vtpm.xml
# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
Migration: [100 %]
hostB:# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose
Migration: [100 %]

4. migrate transient vm with persistent_state=yes pass:

# virsh create avocado-transient_persistentstate.xml
Domain 'avocado-vt-vm1' created from avocado-transient_persistentstate.xml

<tpm model='tpm-crb'>
  <backend type='emulator' version='2.0' persistent_state='yes'>
    <encryption secret='64d3f77b-45d6-4c8a-90e9-53b76b9acbab'/>
    <active_pcr_banks>
      <sha256/>
    </active_pcr_banks>
  </backend>
  <alias name='tpm0'/>
</tpm>
...
<seclabel type='static' model='selinux' relabel='yes'>
  <label>system_u:system_r:svirt_t:s0:c392,c662</label>
  <imagelabel>system_u:object_r:svirt_image_t:s0:c392,c662</imagelabel>
</seclabel>

# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
Migration: [100 %]
hostB:# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostA/system --verbose
Migration: [100 %]

Mark pre-verification as pass per comment 19, comment 30 and this.

Test pass for NFS storage:

libvirt-9.0.0-4.el9.x86_64
qemu-kvm-7.2.0-8.el9.x86_64
swtpm-0.8.0-1.el9.x86_64
libtpms-0.9.1-2.20211126git1ff6fe1f43.el9.x86_64

10.*.*.*:/test/swtpm nfs4 70G 14G 57G 20% /var/lib/libvirt/swtpm

The following steps got the expected results:

swtpm socket --print-capabilities

Define vm, start vm (with full vtpm xml):

<tpm model='tpm-crb'>
  <backend type='emulator' version='2.0'>
    <encryption secret='edec963c-3f90-423e-ba51-d6b1e9fdd041'/>
    <active_pcr_banks>
      <sha256/>
    </active_pcr_banks>
  </backend>
  <alias name='tpm0'/>
</tpm>

live migration to and back.
'--p2p --undefinesource --persistent' migration to and back
live (with or without --p2p) migration to and back for transient vm

A swtpm.log security context issue is filed as https://bugzilla.redhat.com/show_bug.cgi?id=2169262.

Test pass for CEPH storage:

libvirt-9.0.0-5.el9.x86_64
qemu-kvm-7.2.0-8.el9.x86_64
swtpm-0.8.0-1.el9.x86_64
libtpms-0.9.1-2.20211126git1ff6fe1f43.el9.x86_64

mount -t ceph 10.*.*.*:6789:/swtpm /var/lib/libvirt/swtpm/ -o name=admin

Scenarios and results are the same as in comment 30 and comment 31.

when vm running:
drwx--x--x. 3 root root system_u:object_r:unlabeled_t:s0 1 Feb 17 04:14 92601f75-b208-4f70-a0fe-8e60947ef763
drwx------. 2 tss tss system_u:object_r:svirt_image_t:s0:c392,c662 2 Feb 17 04:15 tpm2
-rw-------. 1 tss tss system_u:object_r:svirt_image_t:s0:c392,c662 9.2K Feb 17 04:15 tpm2-00.permall

when vm shutoff:
drwx--x--x. 3 root root system_u:object_r:unlabeled_t:s0 1 Feb 17 03:58 92601f75-b208-4f70-a0fe-8e60947ef763
drwx------. 2 tss tss system_u:object_r:virt_var_lib_t:s0 2 Feb 17 04:06 tpm2
-rw-------. 1 tss tss system_u:object_r:virt_var_lib_t:s0 6.1K Feb 17 04:06 tpm2-00.permall

Additional test: Unix socket migration PASS for the following scenarios:
1. Persistent vm migrating to and back
2. Transient vm migrating to and back
3. Transient vm with persistent_state='yes' migrating to and back

[hostA]# virsh migrate avocado-vt-vm1 --desturi qemu+unix:///system?socket=/tmp/22222-sock --live --verbose --p2p --migrateuri unix:///tmp/33333-sock
Migration: [100 %]
[hostB]# virsh migrate avocado-vt-vm1 --desturi qemu+unix:///system?socket=/tmp/44444-sock --live --verbose --p2p --migrateuri unix:///tmp/55555-sock
Migration: [100 %]

Retest for NFS with new libvirt build also PASS. Mark as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171
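
For anyone repeating the verification, the `swtpm socket --print-capabilities` step from the NFS test can be scripted before attempting a migration. The sketch below is only an illustration: the function name `has_migration_cap` is hypothetical, the sample JSON is a hand-written stand-in rather than captured output, and the check is a plain text match on the `cmdarg-migration` capability named in the upstream commits, not a real JSON parse.

```shell
#!/bin/sh
# Sketch: check whether an swtpm capabilities document advertises the
# migration option (the "cmdarg-migration" capability from the upstream
# commit list). has_migration_cap is a hypothetical helper name.

# Reads capabilities text on stdin; exit status 0 if the feature string
# is present, non-zero otherwise. Text match only, no JSON parsing.
has_migration_cap() {
    grep -q '"cmdarg-migration"'
}

# Illustrative sample only (assumed shape). On a real host, pipe in:
#   swtpm socket --print-capabilities
sample='{ "type": "swtpm", "features": [ "tpm-2.0", "cmdarg-migration" ], "version": "0.8.0" }'

if printf '%s\n' "$sample" | has_migration_cap; then
    echo "swtpm advertises cmdarg-migration"
else
    echo "swtpm does not advertise cmdarg-migration"
fi
```

On a host with an swtpm older than 0.8 the check would be expected to take the second branch, which matches the dependency on the swtpm rebase noted earlier in this bug.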